Transformation steps in Pipeline Designer
Steps extend and expand the functionality of transformations. You can use the following steps in Pipeline Designer.
Steps: A - F
Transform
Add a sequence that depends on changes in field values. Each time the value of at least one field changes, PDI resets the sequence.
​Blocking step​
Flow
Block flow until all incoming rows have been processed. Subsequent steps only receive the last input row to this step.
​Concat fields​
Transform
Concatenate multiple fields into one target field. The fields can be separated by a separator, and the enclosure logic is fully compatible with the Text File Output step.
​Copy rows to result​
Job
Write rows to the executing job. The information will then be passed to the next entry in this job.
​Data Grid​
Input
Enter rows of static data in a grid, usually for testing, reference, or demo purposes.
​Dummy (do nothing)​
Flow
Does not do anything. It is useful, however, when testing things or in certain situations where you want to split streams.
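The reset behavior of the sequence step described at the top of this table can be sketched in a few lines of Python. This is only an illustration of the idea, not how PDI implements it: the function name `add_changing_sequence`, the output field `seq`, and the `start` and `increment` parameters are assumptions made for the example.

```python
def add_changing_sequence(rows, watch_fields, start=1, increment=1):
    """Yield each row with a 'seq' field that restarts when a watched field changes.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    previous_key = object()  # sentinel that never equals a real key
    counter = start
    for row in rows:
        key = tuple(row[f] for f in watch_fields)
        if key != previous_key:
            # At least one watched field changed: reset the sequence.
            counter = start
            previous_key = key
        else:
            counter += increment
        yield {**row, "seq": counter}


rows = [
    {"region": "EU", "id": "a"},
    {"region": "EU", "id": "b"},
    {"region": "US", "id": "c"},
    {"region": "EU", "id": "d"},
]
sequenced = list(add_changing_sequence(rows, ["region"]))
```

Because the comparison is made against the previous row only, the last `EU` row starts a new sequence rather than continuing the earlier one.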
Steps: G - L
​Get Repository Names​
Input
List detailed information about transformations and/or jobs in a repository.
​Get variables​
Job
Determine the values of certain (environment or Kettle) variables and put them in field values.
​Group by​
Statistics
Build aggregates in a group-by fashion. This works only on sorted input. If the input is not sorted, only consecutive rows with identical group values are aggregated together.
​HTTP client​
Lookup
Call a web service over HTTP by supplying a base URL and allowing parameters to be set dynamically.
Joins
Output the Cartesian product of the input streams. The number of output rows is the product of the row counts of the input streams.
​Json Input​
Input
Extract relevant portions out of JSON structures (file or incoming field) and output rows.
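The sorted-input requirement of Group by can be illustrated with Python's `itertools.groupby`, which also groups only consecutive rows. This is an analogy rather than PDI's implementation; the helper `group_sum` and the sample field names are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def group_sum(rows, key_field, value_field):
    """Sum value_field over runs of consecutive rows that share key_field."""
    return [
        (key, sum(r[value_field] for r in group))
        for key, group in groupby(rows, key=itemgetter(key_field))
    ]


rows = [
    {"city": "Oslo", "sales": 10},
    {"city": "Rome", "sales": 5},
    {"city": "Oslo", "sales": 7},
]

# Unsorted input: the two Oslo rows are not adjacent, so they form two groups.
unsorted_result = group_sum(rows, "city", "sales")

# Sorting on the group field first yields one aggregate per city.
sorted_result = group_sum(sorted(rows, key=itemgetter("city")), "city", "sales")
```

In a transformation, the equivalent of the `sorted(...)` call is sorting the rows on the group fields before they reach Group by.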
Steps: M - R
​Merge join​
Joins
Join two streams on a given key and output a joined set. The input streams must be sorted on the join key.
​Merge rows (diff)​
Joins
Merge two streams of rows, sorted on a certain key. The two streams are compared, and rows are flagged as identical, changed, deleted, or new.
​Python Executor​
Scripting
Map upstream data from a PDI input step or execute a Python script to generate data. When you send all rows, Python stores the dataset in a variable that kicks off your Python script.
​REST client​
Lookup
Consume RESTful services. Representational State Transfer (REST) is a design idiom that embraces a stateless client-server architecture in which web services are viewed as resources and identified by their URLs.
​Row Denormaliser​
Transform
Denormalise rows by looking up key-value pairs and by assigning them to new fields in the output rows. This method aggregates and needs the input rows to be sorted on the grouping fields.
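The comparison that Merge rows (diff) performs on two sorted streams can be sketched as a single pass with two cursors. The following is a minimal sketch under stated assumptions, not the step's actual code: the function `merge_rows_diff`, its parameters, and the choice to emit the compare row for changed keys are all illustrative.

```python
def merge_rows_diff(reference, compare, key_field, value_fields):
    """Return (flag, row) pairs for two lists of dicts sorted on key_field.

    Hypothetical helper: flags rows as identical, changed, deleted, or new.
    """
    result = []
    i = j = 0
    while i < len(reference) and j < len(compare):
        ref, cmp_row = reference[i], compare[j]
        if ref[key_field] == cmp_row[key_field]:
            # Same key in both streams: compare the value fields.
            same = all(ref[f] == cmp_row[f] for f in value_fields)
            result.append(("identical", ref) if same else ("changed", cmp_row))
            i += 1
            j += 1
        elif ref[key_field] < cmp_row[key_field]:
            # Key exists only in the reference stream.
            result.append(("deleted", ref))
            i += 1
        else:
            # Key exists only in the compare stream.
            result.append(("new", cmp_row))
            j += 1
    # Whatever remains in one stream has no counterpart in the other.
    result.extend(("deleted", r) for r in reference[i:])
    result.extend(("new", r) for r in compare[j:])
    return result


reference = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
compare = [{"id": 1, "name": "a"}, {"id": 3, "name": "C"}, {"id": 4, "name": "d"}]
flagged = merge_rows_diff(reference, compare, "id", ["name"])
```

The single pass only works because both inputs are sorted on the key, which is why the step requires sorted streams.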
Steps: S - Z
​Select values​
Transform
Select or remove fields in a row. Optionally, set the field metadata: type, length, and precision.
​Sorted Merge​
Joins
Merge rows coming from multiple input steps, provided that these rows are themselves sorted on the given key fields.
Transform
Split a single string field by a delimiter and create a new row for each split term.
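The last two entries in this table can be approximated in Python for illustration. The sketch below uses the standard library's `heapq.merge` to combine already-sorted inputs, and a small generator to split a delimited field into rows. The function names `sorted_merge` and `split_field_to_rows`, their parameters, and the sample fields are assumptions made for the example, not part of PDI.

```python
import heapq
from operator import itemgetter


def sorted_merge(*streams, key_field):
    """Merge several inputs that are each already sorted on key_field."""
    return list(heapq.merge(*streams, key=itemgetter(key_field)))


def split_field_to_rows(rows, field, delimiter, new_field):
    """Yield one output row per term found in the delimited string field."""
    for row in rows:
        for term in row[field].split(delimiter):
            yield {**row, new_field: term}


stream_a = [{"id": 1}, {"id": 4}]
stream_b = [{"id": 2}, {"id": 3}]
merged = sorted_merge(stream_a, stream_b, key_field="id")

tagged = list(split_field_to_rows([{"id": 1, "tags": "x;y"}], "tags", ";", "tag"))
```

`heapq.merge` does not sort its inputs. If one stream is out of order, the merged output is out of order too, which mirrors the precondition stated for Sorted Merge.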

