Configure a Teradata Vantage connection in DataHub


This how-to demonstrates how to create a connection to Teradata Vantage with DataHub, and ingest metadata about tables and views, along with usage and lineage information.


Set up DataHub

  • Install the Teradata plugin for DataHub in the environment where you have DataHub installed

    pip install 'acryl-datahub[teradata]'
  • Set up a Teradata user and grant it the privileges needed to read the dictionary tables

    CREATE USER datahub FROM <database> AS PASSWORD = <password> PERM = 20000000;
    GRANT SELECT ON dbc.columns TO datahub;
    GRANT SELECT ON dbc.databases TO datahub;
    GRANT SELECT ON dbc.tables TO datahub;
    GRANT SELECT ON DBC.All_RI_ChildrenV TO datahub;
    GRANT SELECT ON DBC.ColumnsV TO datahub;
    GRANT SELECT ON DBC.IndicesV TO datahub;
    GRANT SELECT ON dbc.TableTextV TO datahub;
    GRANT SELECT ON dbc.TablesV TO datahub;
    GRANT SELECT ON dbc.dbqlogtbl TO datahub; -- if lineage or usage extraction is enabled
  • If you want to run profiling, you need to grant SELECT permission on all the tables you want to profile.

  • If you want to extract lineage or usage metadata, query logging must be enabled, with a query text size large enough to hold your queries (by default, Teradata captures at most 200 characters of query text). An example of how to set this for all users:

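    -- set up query logging on all users; adjust the SQLTEXT limit to fit your queries
    REPLACE QUERY LOGGING LIMIT SQLTEXT=2000 ON ALL;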
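
Before moving on, it can help to confirm that the plugin from the first step really landed in the active Python environment. The following is a minimal sketch using only the standard library. The helper name `installed_version` is made up for this example, and checking for `teradatasqlalchemy` assumes that the `teradata` extra pulls in that package as a dependency.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "teradatasqlalchemy" is an assumed dependency of the teradata extra.
    for name in ("acryl-datahub", "teradatasqlalchemy"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports "not installed", re-run the pip command above in the same environment that runs DataHub ingestion.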

Add a Teradata connection to DataHub

With DataHub running, open the DataHub GUI and log in. In this example, DataHub is running at localhost:9002.

  1. Start the new connection wizard by clicking on the ingestion plug icon

    Ingestion Label

    and then selecting "Create new source"

    Create New Source
  2. Scroll through the list of available sources and select Other

    Select Source
  3. A recipe is needed to configure the connection to Teradata and to define options such as whether to capture table and column lineage, profile the data, or retrieve usage statistics. Below is a simple recipe to get you started. Change the host, username, and password to match your environment.

    pipeline_name: my-teradata-ingestion-pipeline
    source:
      type: teradata
      config:
        host_port: ""  # <host>:<port> of your Teradata instance
        username: myuser
        password: mypassword
        # database_pattern:
        #   allow:
        #     - "my_database"
        #   ignoreCase: true
        include_table_lineage: true
        include_usage_statistics: true
        stateful_ingestion:
          enabled: true

    Pasting the recipe into the window should look like this:

    New Ingestion Source
  4. Click Next and then set up the required schedule.

    Set Schedule
  5. Click Next to Finish Up and give the connection a name. Click Advanced so that the correct CLI version can be set. DataHub support for Teradata became available in CLI 0.12.x. We suggest selecting the most recent version to ensure the best compatibility.

    Finish up
  6. Once the new source has been saved, it can be executed manually by clicking Run.


    Clicking on "Succeeded" after a successful execution will bring up a dialogue similar to this one, where you can see the Databases, Tables and Views that have been ingested into DataHub.

    Ingestion Result
  7. The metadata can now be explored in the GUI by browsing:

    1. DataSets provides a list of the datasets (tables and views) loaded

    2. Entities captured from the database

    3. Schema of an entity showing column/field names, data types and usage if it has been captured

      Schema display
    4. Lineage providing a visual representation of how data is linked between tables and views

      Lineage picture
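
The recipe from step 3 does not have to be run from the GUI. As a rough sketch, the same settings can be passed to DataHub's Python pipeline API (`Pipeline.create`, `run`, `raise_from_status` from the `acryl-datahub` package). Several things here are assumptions rather than part of this how-to: the `datahub-rest` sink, the metadata service address `http://localhost:8080`, the helper names `build_recipe`, `run_recipe` and `main`, and the environment variable names used for the credentials.

```python
import os


def build_recipe(host_port, username, password, gms_url="http://localhost:8080"):
    """Return the step 3 recipe as a dict, plus an assumed REST sink."""
    return {
        "pipeline_name": "my-teradata-ingestion-pipeline",
        "source": {
            "type": "teradata",
            "config": {
                "host_port": host_port,
                "username": username,
                "password": password,
                "include_table_lineage": True,
                "include_usage_statistics": True,
            },
        },
        # Assumption: metadata is sent to a DataHub server over REST at gms_url.
        "sink": {"type": "datahub-rest", "config": {"server": gms_url}},
    }


def run_recipe(recipe):
    """Run one ingestion and raise if the pipeline reports failures."""
    # Imported lazily so build_recipe works without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()


def main():
    # Hypothetical environment variable names; rename to suit your setup.
    recipe = build_recipe(
        os.environ["TERADATA_HOST_PORT"],
        os.environ["TERADATA_USER"],
        os.environ["TERADATA_PASSWORD"],
    )
    run_recipe(recipe)
```

Calling `main()` with the three environment variables set runs a single ingestion. Reading the credentials from the environment keeps the password out of the script, which is a design choice of this sketch and not a DataHub requirement.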


This how-to demonstrated how to create a connection to Teradata Vantage with DataHub in order to capture metadata about tables and views, along with lineage and usage statistics.

Further reading

If you have any questions or need further assistance, please visit our community forum where you can get support and interact with other community members.