Merge rows (diff)

The Merge rows (diff) step compares and merges data within two input streams. This step is useful for comparing data collected at two different times. For example, if the source system of your data warehouse does not contain a timestamp of the last data update, you can use this step to compare two data streams and flag changes.

Based on keys for comparison, this step merges reference rows (previous data) with compare rows (new data) and creates merged output rows. A flag in the row indicates how the values were compared and merged.

Flag values include:

  • identical: The key was found in both rows, and the compared values are identical.

  • changed: The key was found in both rows, but one or more compared values are different.

  • new: The key was not found in the reference rows.

  • deleted: The key was not found in the compare rows.

If the rows are flagged as deleted, the merged output rows are created based on the reference rows stream.

For identical, new, or changed rows, the merged output rows are created based on the compare rows stream.

You can send values from the merged and flagged rows to a subsequent step in your transformation, such as Switch-Case, and then use the flag field to control inserts, updates, and deletes on a target table.

Step name

  • Step name: Specify the unique name of the step on the canvas. You can customize the name or leave it as the default.

Options

Options for Merge rows (diff)

The Merge rows (diff) step contains the following options.

Option
Description

Reference rows origin

Select the input source for the reference rows you want to compare (a previous step in the transformation).

Compare rows origin

Select the input source for the compare rows (a previous step in the transformation).

Flag fieldname

Specify the field name that will contain the comparison result flag in the output row.

Keys to match

Field names that contain the keys used for matching.

Values to compare

Field names that contain the values used for comparison.

Get key fields

Select to populate the Keys to match table with all fields in the Reference rows origin step.

Get value fields

Select to populate the Values to compare table with all fields in the Compare rows origin step.

Example

The Merge rows – mergs 2 streams of data and add a flag.ktr transformation in the data-integration/samples/transformations directory illustrates how to use the Merge rows (diff) step.

Metadata injection support

This step supports metadata injection. You can use it with ETL metadata injection to pass metadata to your transformation at runtime.

Last updated

Was this helpful?