Sample use cases

Data lineage and impact analysis can be applicable in several ways.

  • As an ETL Developer:

    • There are changes in my source system, such as fields which are added, deleted and renamed. What parts of my ETL processes need to adapt? (Impact Analysis)

    • I need additional information in my target system, such as for reports. What sources are can provide this additional information? (Data Lineage)

  • As a Data Steward:

    • There is a need for auditability and transparency to determine where data is coming from. A global, company-wide, metadata repository needs data lineage information from different systems and applications, i.e. very fine-grained metadata.

    • What elements (fields, tables, etc.) in my ETL processes are never used? How many times is a specific element used in some or all of my ETL processes?

  • As a Report/Business User:

    • Is my data accurate?

    • I want to find reports which include specific information from a source, such as a field. This process is "data discovery." For example, are there any data sources which include sales and gender? Are there any reports which include sales and zip codes?

  • As a Troubleshooting Operator:

    • The numbers in the report are wrong (or supposed to be wrong). What processes (transformations, jobs) are involved to help me determine where these numbers are coming from?

    • A job or transformation did not finish successfully. What target tables and fields are affected which are used in the reports?

  • As an Administrator:

    • For documentation and auditing purposes, I want to have a report on external sources and target fields, tables, and databases of my ETL processes. I need the data for a specific date and version.

    • To ensure compliance, I want to validate naming conventions of artifacts (fields, tables, etc.)

    • For integration into third-party data lineage tools, I want a flexible way of exporting the collected data lineage information.

Last updated

Was this helpful?