Transformation steps in Pipeline Designer
Steps extend and expand the functionality of transformations. You can use the following steps in Pipeline Designer.
Steps: A - F
Transform
Add a sequence that depends on changes in field values. Each time the value of at least one field changes, PDI resets the sequence.
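The reset behavior can be illustrated outside PDI with a minimal Python sketch. The function name `add_changing_sequence`, the output field `seq`, and the `start` and `increment` parameters are hypothetical and chosen only for illustration; they are not PDI settings.

```python
def add_changing_sequence(rows, fields, start=1, increment=1):
    """Yield each row with a 'seq' counter that resets when a watched field changes."""
    previous = None
    counter = start
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key != previous:
            counter = start  # at least one watched field changed: reset
            previous = key
        yield {**row, "seq": counter}
        counter += increment

rows = [{"region": "EU"}, {"region": "EU"}, {"region": "US"}, {"region": "EU"}]
numbered = list(add_changing_sequence(rows, ["region"]))
```

In this sketch the counter restarts whenever the watched value differs from the previous row, so the last "EU" row starts a new sequence rather than continuing the first one.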
Flow
Block flow until all incoming rows have been processed. Subsequent steps only receive the last input row to this step.
Transform
Concatenate multiple fields into one target field. The fields can be separated by a separator and the enclosure logic is completely compatible with the Text File Output step.
Job
Write rows to the executing job. The information will then be passed to the next entry in this job.
Flow
Does not do anything. It is useful, however, when testing things or in certain situations where you want to split streams.
Steps: G - L
Input
List detailed information about transformations and/or jobs in a repository.
Job
Determine the values of certain (environment or Kettle) variables and put them in field values.
Statistics
Build aggregates in a group-by fashion. This works only on sorted input. If the input is not sorted, only consecutive duplicate rows are handled correctly.
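Why sorting matters can be seen in a small Python sketch. `itertools.groupby` also groups only consecutive rows, so it serves as a rough analogy for a streaming aggregate; the helper `group_sum` and the sample fields are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter

def group_sum(rows, key, value):
    """Sum `value` per run of consecutive rows that share `key` (no sorting done here)."""
    return [
        (k, sum(r[value] for r in grp))
        for k, grp in groupby(rows, key=itemgetter(key))
    ]

rows = [
    {"city": "Oslo", "sales": 10},
    {"city": "Rome", "sales": 5},
    {"city": "Oslo", "sales": 7},
]

# Unsorted input: "Oslo" appears as two separate groups.
unsorted_result = group_sum(rows, "city", "sales")

# Sorted input: each key forms exactly one group.
sorted_result = group_sum(sorted(rows, key=itemgetter("city")), "city", "sales")
```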
Lookup
Call a web service over HTTP by supplying a base URL and allowing parameters to be set dynamically.
Joins
Output the Cartesian product of the input streams. The number of output rows is the product of the numbers of rows in the input streams.
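The row-count rule can be checked with a short Python sketch. The two sample streams below are hypothetical; the point is only that 2 input rows combined with 3 input rows give 6 output rows.

```python
from itertools import product

colors = [{"color": "red"}, {"color": "blue"}]
sizes = [{"size": "S"}, {"size": "M"}, {"size": "L"}]

# Pair every row of the first stream with every row of the second.
joined = [{**a, **b} for a, b in product(colors, sizes)]
```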
Input
Extract relevant portions out of JSON structures (file or incoming field) and output rows.
Steps: M - R
Joins
Join two streams on a given key and output a joined set. The input streams must be sorted on the join key.
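The sorting requirement follows from how a merge join walks both streams in step. The following is a minimal sketch of an inner merge join in Python, not PDI's implementation; `merge_join` and the sample fields are illustrative assumptions.

```python
def merge_join(left, right, key):
    """Inner-join two lists of dicts that are both sorted ascending on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key has no partner yet, advance left
        elif lk > rk:
            j += 1          # right key has no partner yet, advance right
        else:
            # Find the run of right rows with this key ...
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # ... and pair it with every left row that has the same key.
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out

left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 4, "name": "d"}]
right = [{"id": 2, "qty": 5}, {"id": 2, "qty": 6}, {"id": 3, "qty": 1}, {"id": 4, "qty": 9}]
joined_rows = merge_join(left, right, "id")
```

Because each stream is only ever advanced forward, a key that appears out of order would be skipped silently, which is why unsorted input produces incomplete results.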
Joins
Merge two streams of rows, sorted on a certain key. The two streams are compared and the identical, changed, deleted, and new rows are flagged.
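The comparison can be sketched in Python as a single pass over two key-sorted streams. The function `diff_rows`, the output field `flag`, and the exact flag strings are assumptions for illustration and may differ from what the step emits.

```python
def diff_rows(reference, compare, key, value_fields):
    """Flag rows of two key-sorted streams as identical, changed, deleted, or new."""
    out, i, j = [], 0, 0
    while i < len(reference) or j < len(compare):
        if j >= len(compare) or (
            i < len(reference) and reference[i][key] < compare[j][key]
        ):
            # Key exists only in the reference stream.
            out.append({**reference[i], "flag": "deleted"})
            i += 1
        elif i >= len(reference) or compare[j][key] < reference[i][key]:
            # Key exists only in the compare stream.
            out.append({**compare[j], "flag": "new"})
            j += 1
        else:
            # Same key in both streams: compare the value fields.
            same = all(reference[i][f] == compare[j][f] for f in value_fields)
            out.append({**compare[j], "flag": "identical" if same else "changed"})
            i += 1
            j += 1
    return out

old = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
new = [{"id": 1, "v": "a"}, {"id": 3, "v": "x"}, {"id": 4, "v": "d"}]
flagged = diff_rows(old, new, "id", ["v"])
```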
Scripting
Map upstream data from a PDI input step or execute a Python script to generate data. When you send all rows, Python stores the dataset in a variable that kicks off your Python script.
Lookup
Consume RESTful services. Representational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which web services are viewed as resources and can be identified by their URLs.
Transform
Denormalise rows by looking up key-value pairs and by assigning them to new fields in the output rows. This method aggregates and needs the input rows to be sorted on the grouping fields.
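A rough Python sketch of the idea: key-value rows that share a grouping field are folded into one wide row. The helper `denormalise` and the field names are hypothetical, and the input is assumed to be sorted on the grouping field.

```python
from itertools import groupby
from operator import itemgetter

def denormalise(rows, group_field, key_field, value_field):
    """Turn key-value rows into one output row per group (input sorted on group_field)."""
    out = []
    for group, members in groupby(rows, key=itemgetter(group_field)):
        record = {group_field: group}
        for m in members:
            record[m[key_field]] = m[value_field]  # key becomes a new output field
        out.append(record)
    return out

rows = [
    {"customer": 1, "attr": "city", "val": "Oslo"},
    {"customer": 1, "attr": "phone", "val": "555-0100"},
    {"customer": 2, "attr": "city", "val": "Rome"},
]
wide = denormalise(rows, "customer", "attr", "val")
```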
Steps: S - Z
Transform
Select or remove fields in a row. Optionally, set the field metadata: type, length, and precision.
Joins
Merge rows coming from multiple input steps, provided that these rows are themselves sorted on the given key fields.
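The same idea exists in Python's standard library as `heapq.merge`, which combines already-sorted inputs into one sorted output without re-sorting. The sample streams below are hypothetical.

```python
import heapq
from operator import itemgetter

stream_a = [{"id": 1}, {"id": 4}, {"id": 7}]
stream_b = [{"id": 2}, {"id": 3}, {"id": 9}]

# Each input must already be sorted on the key; merge only interleaves them.
merged = list(heapq.merge(stream_a, stream_b, key=itemgetter("id")))
```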
Transform
Split a single string field by a delimiter and create a new row for each split term.
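As a minimal sketch of the row expansion, assuming a hypothetical helper `split_field_to_rows` and an illustrative output field name:

```python
def split_field_to_rows(rows, field, delimiter, new_field):
    """Emit one output row per term found in `field`, keeping the other fields."""
    out = []
    for row in rows:
        for term in row[field].split(delimiter):
            out.append({**row, new_field: term})
    return out

rows = [{"order": 10, "tags": "red;blue"}, {"order": 11, "tags": "green"}]
expanded = split_field_to_rows(rows, "tags", ";", "tag")
```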
Transform
Apply operations such as trimming and padding to string values.
Input
Read data from a text file in several formats. This data can then be passed to your next step(s).
Transform
Remove duplicate rows and leave only unique occurrences. This works only on sorted input. If the input is not sorted, only consecutive duplicate rows are handled correctly.
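The effect of unsorted input can be sketched in Python with a helper that, like a streaming step, compares each row only with the one before it. The name `unique_consecutive` is an assumption for illustration.

```python
def unique_consecutive(rows):
    """Drop a row only when it equals the row immediately before it."""
    out = []
    for row in rows:
        if not out or out[-1] != row:
            out.append(row)
    return out

values = ["a", "a", "b", "a"]

# Unsorted: the trailing "a" is not adjacent to the first run, so it survives.
unsorted_unique = unique_consecutive(values)

# Sorted: all duplicates are adjacent and are removed.
sorted_unique = unique_consecutive(sorted(values))
```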