Transformation Executor

The Transformation Executor step runs a Pentaho Data Integration (PDI) transformation from within another transformation.

It is similar to the Job Executor step, but it runs transformations instead of jobs.

How it works

Depending on your use case, you can configure this step to:

  • Execute the specified transformation once per input row (default). The incoming row can be used to set parameters and variables.

  • Execute the specified transformation once per group of rows, based on changes in a field value or after collecting rows for a specified duration.

  • Run multiple copies of the step to enable parallel execution (depending on your overall transformation design).

Important notes

For performance reasons, parent transformation logging typically contains only the last processed batch.

To capture child-transformation logs, set a target step in Execution results and include the Execution logging text field (default ExecutionLogText).

Samples

Sample transformations demonstrating this step are available in:

design-tools/data-integration/samples/transformations/transformation-executor

  • trans-executor-child.ktr: Adds a sequence to input rows.

  • trans-executor-parent.ktr: Passes rows to a transformation that is executed three times. You can preview the Results, Result files, and Result rows steps to view output.

Step name and transformation

| Option | Description |
| --- | --- |
| Step name | Specifies the unique name of the step on the canvas. Default: Transformation Executor. |
| Transformation | Specify the transformation to execute by entering its path or selecting Browse. If you select a transformation that has the same root path as the current transformation, PDI automatically inserts ${Internal.Entry.Current.Directory} in place of the common root path. Example: if the current transformation is /home/admin/transformation.ktr and you select /home/admin/path/sub.ktr, the path becomes ${Internal.Entry.Current.Directory}/path/sub.ktr. If you are working with a repository, specify the transformation name. If you are not working with a repository, specify the XML file name of the transformation. Transformations previously specified by reference are converted automatically to use the transformation name in the Pentaho Repository. |
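
The path substitution described for the Transformation option can be illustrated with a small Python sketch. It only reproduces the documented example; the function name `relativize` is hypothetical, and how PDI computes the common root internally is not stated here, so treat the logic as an assumption.

```python
from pathlib import PurePosixPath

# Variable name taken from the option description above.
CURRENT_DIR_VAR = "${Internal.Entry.Current.Directory}"


def relativize(current_transformation: str, selected: str) -> str:
    """Replace the current transformation's directory with the internal variable.

    Assumption: the "common root path" is the directory that contains the
    current transformation. Paths outside that directory are returned unchanged.
    """
    current_dir = PurePosixPath(current_transformation).parent
    try:
        relative = PurePosixPath(selected).relative_to(current_dir)
    except ValueError:
        # The selected file does not live under the current directory.
        return selected
    return f"{CURRENT_DIR_VAR}/{relative}"


print(relativize("/home/admin/transformation.ktr", "/home/admin/path/sub.ktr"))
# prints ${Internal.Entry.Current.Directory}/path/sub.ktr
```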

Configure the step (tabs)

The Transformation Executor step includes the following tabs:

  • Parameters

  • Execution results

  • Row grouping

  • Result rows

  • Result files

Parameters tab

Use this tab to define or pass variables and parameters to the child transformation.

If multiple rows are passed to the child transformation, the first row in the group is used to set parameters/variables.

For each entry you add, assign a value in either Variable / Parameter to use or Static input value (not both).

| Option | Description |
| --- | --- |
| Variable / Parameter name | Name of the variable or parameter to pass to the child transformation. This name must be unique in the table. |
| Variable / Parameter to use | Source for the value: an incoming field, a manually entered variable name, or a selected internal variable (Ctrl+Space). You can also enter ${...} notation to pass a variable reference rather than the resolved field value. When set, Static input value is disabled. |
| Static input value | Constant value to use. When set, Variable / Parameter to use is disabled. |
| Inherit all variables from transformation | When selected, variables from the parent transformation are also available to the child transformation. See Order of processing. |
| Get Parameters | Inserts parameters defined in the child transformation. The parameter description is inserted into Static input value. |
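
As a rough illustration of the rules for this tab, the following Python sketch resolves the parameter table against a group of rows. The names `ParameterEntry` and `resolve_parameters` are hypothetical, and the lookup order (incoming field first, then parent variable, then the text as entered) is an assumption made for the example, not documented PDI behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ParameterEntry:
    name: str                           # "Variable / Parameter name"
    source: Optional[str] = None        # "Variable / Parameter to use"
    static_value: Optional[str] = None  # "Static input value"


def resolve_parameters(
    entries: List[ParameterEntry],
    group_rows: List[Dict[str, Any]],
    parent_variables: Dict[str, str],
) -> Dict[str, Any]:
    names = [entry.name for entry in entries]
    if len(names) != len(set(names)):
        raise ValueError("parameter names must be unique in the table")

    # Only the first row of the group is used to set values.
    first_row = group_rows[0] if group_rows else {}

    resolved: Dict[str, Any] = {}
    for entry in entries:
        if entry.source is not None and entry.static_value is not None:
            raise ValueError(f"{entry.name}: set a source or a static value, not both")
        if entry.static_value is not None:
            resolved[entry.name] = entry.static_value
        elif entry.source in first_row:
            resolved[entry.name] = first_row[entry.source]
        elif entry.source in parent_variables:
            resolved[entry.name] = parent_variables[entry.source]
        elif entry.source is not None:
            # Assumption: anything else (such as ${...}) is passed as written.
            resolved[entry.name] = entry.source
    return resolved
```

For example, with entries `REGION` (source field `region`) and `MODE` (static value `full`), a group whose first row has `region` set to `EU` yields `REGION=EU` and `MODE=full`, regardless of the values in later rows.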

Order of processing

How variables and parameters are applied depends on whether Inherit all variables from transformation is selected.

  • If selected:

    1. Parent transformation (Parameters tab)

    2. Transformation Executor (Parameters tab)

    3. Child transformation (Parameters tab)

    If a variable name is defined in multiple places, later entries override earlier ones.

  • If cleared:

    1. Transformation Executor (Parameters tab)

    2. Child transformation (Parameters tab)
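
The ordering above can be sketched in Python as a layered merge in which each later layer overwrites names defined by an earlier one. The function name `effective_variables` is hypothetical. The override rule is stated above only for the selected case; applying the same rule when the option is cleared is an assumption.

```python
from typing import Dict


def effective_variables(
    parent: Dict[str, str],
    executor: Dict[str, str],
    child: Dict[str, str],
    inherit_all: bool,
) -> Dict[str, str]:
    """Merge variable layers in the documented order; later layers win."""
    layers = [parent, executor, child] if inherit_all else [executor, child]
    merged: Dict[str, str] = {}
    for layer in layers:
        merged.update(layer)  # later entries override earlier ones
    return merged


parent = {"ENV": "dev", "ROOT": "/data"}
executor = {"ENV": "test"}
child = {"BATCH": "100"}

print(effective_variables(parent, executor, child, inherit_all=True))
print(effective_variables(parent, executor, child, inherit_all=False))
```

With the option selected, `ROOT` from the parent is visible to the child and `ENV` takes the value set on the Transformation Executor. With the option cleared, `ROOT` is not passed at all.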

Execution results tab

Use this tab to send execution metrics and logging information to a target step in the parent transformation. Leave a field blank to omit that metric.

| Option | Description | Default |
| --- | --- | --- |
| Target step for the execution results | Step in the parent transformation that receives the execution results. | N/A |
| Execution time (ms) | Field name for execution time. | ExecutionTime |
| Execution result | Field name for the execution result. | ExecutionResult |
| Number of errors | Field name for the error count. | ExecutionNrErrors |
| Number of rows read | Field name for total rows read. | ExecutionLinesRead |
| Number of rows written | Field name for total rows written. | ExecutionLinesWritten |
| Number of rows input | Field name for total input rows. | ExecutionLinesInput |
| Number of rows output | Field name for total output rows. | ExecutionLinesOutput |
| Number of rows rejected | Field name for total rows rejected. | ExecutionLinesRejected |
| Number of rows updated | Field name for total rows updated. | ExecutionLinesUpdated |
| Number of rows deleted | Field name for total rows deleted. | ExecutionLinesDeleted |
| Number of files retrieved | Field name for total files retrieved. | ExecutionFilesRetrieved |
| Exit status | Field name for exit status. | ExecutionExitStatus |
| Execution logging text | Field name for execution log text. | ExecutionLogText |
| Log channel ID | Field name for log channel ID. | ExecutionLogChannelID |
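
To illustrate how blank field names omit a metric, here is a minimal Python sketch that builds one execution-results row from a name mapping. Only a subset of the default field names is shown. The metric keys (`execution_time_ms`, `result`, `errors`, `log_text`) and the function `build_result_row` are hypothetical names chosen for the example.

```python
from typing import Any, Dict

# Subset of the default field names listed in the table above.
DEFAULT_FIELD_NAMES: Dict[str, str] = {
    "execution_time_ms": "ExecutionTime",
    "result": "ExecutionResult",
    "errors": "ExecutionNrErrors",
    "log_text": "ExecutionLogText",
}


def build_result_row(metrics: Dict[str, Any], field_names: Dict[str, str]) -> Dict[str, Any]:
    """Return one output row, skipping metrics whose field name is blank."""
    row: Dict[str, Any] = {}
    for metric_key, field_name in field_names.items():
        if field_name and field_name.strip():
            row[field_name] = metrics.get(metric_key)
    return row


field_names = dict(DEFAULT_FIELD_NAMES)
field_names["log_text"] = ""  # blank field name: omit the log text

metrics = {"execution_time_ms": 125, "result": True, "errors": 0, "log_text": "child log"}
print(build_result_row(metrics, field_names))
```

In this sketch the resulting row contains `ExecutionTime`, `ExecutionResult`, and `ExecutionNrErrors`, but no log text field.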

Row grouping tab

Use this tab to control how input rows are grouped before the child transformation runs.

You can group by:

  • A specific number of rows

  • A specific field value (run when the value changes)

  • A specified duration (milliseconds)

To access Field to group rows on or Duration time when collecting rows, clear the default value in Number of rows to send to transformation.

| Option | Description |
| --- | --- |
| Number of rows to send to transformation | Runs the child transformation after every N rows, passing those rows to it. |
| Field to group rows on | Collects rows as long as the field value stays the same. When it changes, runs the child transformation and passes the accumulated rows. |
| Duration time when collecting rows | Collects rows for the specified time (ms), then runs the child transformation and passes the accumulated rows. |

In the child transformation, you can read the rows passed by this step by using the Get rows from result step.
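
The first two grouping modes can be sketched in Python as generators that yield one batch per child-transformation run. The function names are hypothetical, and the sketch covers only grouping by row count and by field change. Grouping by duration depends on wall-clock arrival times and is left out.

```python
from itertools import groupby, islice
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def group_by_count(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield a batch after every `size` rows; the last batch may be smaller."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def group_by_field(rows: Iterable[Row], field: str) -> Iterator[List[Row]]:
    """Yield a batch each time the value of `field` changes.

    Only consecutive rows are grouped, so unsorted input produces several
    batches for the same value.
    """
    for _, group in groupby(rows, key=lambda row: row[field]):
        yield list(group)


rows = [{"k": "a"}, {"k": "a"}, {"k": "b"}, {"k": "a"}]
print([len(batch) for batch in group_by_count(rows, 3)])  # prints [3, 1]
print([len(batch) for batch in group_by_field(rows, "k")])  # prints [2, 1, 1]
```

Because grouping by field reacts to a change in value rather than to distinct values, sorting the input on the grouping field first is usually what you want.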

Result rows tab

Use this tab to send result rows produced by the child transformation to a target step in the parent transformation.

To send result rows from the child transformation back to the parent transformation as output from the Transformation Executor step, you must define the layout table in this tab.

The step verifies that data types in result rows match the layout you specify. If there is a mismatch, the step fails.

| Option | Description |
| --- | --- |
| Target step for result rows | Step in the parent transformation that receives the result rows. |
| Field name | Field name in the result rows. |
| Data type | Data type (for example, Number, Date, String). |
| Length | Optional field length. |
| Precision | Optional precision. |
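
The layout check described for this tab can be pictured with the following Python sketch. The mapping from PDI type names to Python types is illustrative only, the function name `check_rows` is hypothetical, and allowing null values for any type is an assumption.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

# Illustrative mapping from layout type names to Python types (an assumption).
TYPE_MAP = {
    "String": str,
    "Integer": int,
    "Number": float,
    "Date": datetime,
    "Boolean": bool,
}


def check_rows(rows: List[Dict[str, Any]], layout: List[Tuple[str, str]]) -> None:
    """Raise an error if a result row does not match the declared layout."""
    for index, row in enumerate(rows):
        for field_name, type_name in layout:
            if field_name not in row:
                raise KeyError(f"row {index}: missing field {field_name!r}")
            value = row[field_name]
            expected = TYPE_MAP[type_name]
            # Assumption: null (None) values are accepted for any type.
            if value is not None and not isinstance(value, expected):
                raise TypeError(
                    f"row {index}: field {field_name!r} is "
                    f"{type(value).__name__}, expected {type_name}"
                )


layout = [("id", "Integer"), ("name", "String")]
check_rows([{"id": 1, "name": "first"}], layout)  # passes silently
```

A row such as `{"id": "1", "name": "first"}` would raise `TypeError` in this sketch, which mirrors the step failing on a type mismatch.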

Result files tab

Use this tab to send result file names produced by the child transformation to a target step in the parent transformation.

| Option | Description |
| --- | --- |
| Target step for result files information | Step in the parent transformation that receives result file information. |
| Result file name field | Output field name that receives the file name. |
