Transformation steps in Pipeline Designer

Steps extend and expand the functionality of transformations. You can use the following steps in Pipeline Designer.

Steps: A - F

| Name | Category | Description |
| --- | --- | --- |
| Abort | Flow | Abort a transformation. |
| Add a checksum | Transform | Add a checksum column for each input row. |
| Add constants | Transform | Add one or more constants to the input rows. |
| Add sequence | Transform | Get the next value from a sequence. |
| Add value fields changing sequence | Transform | Add a sequence that depends on field values: each time the value of at least one field changes, PDI resets the sequence. |
| Analytic query | Statistics | Execute analytic queries over a sorted dataset (LEAD/LAG/FIRST/LAST); see the sketch after this table. |
| Append streams | Flow | Append two streams in an ordered way. |
| Block this step until steps finish | Flow | Block this step until selected steps finish. |
| Blocking step | Flow | Block the flow until all incoming rows have been processed. Subsequent steps only receive the last input row of this step. |
| Calculator | Transform | Create new fields by performing simple calculations. |
| Concat fields | Transform | Concatenate multiple fields into one target field. The fields can be separated by a separator, and the enclosure logic is fully compatible with the Text file output step. |
| Copy rows to result | Job | Write rows to the executing job. The information will then be passed to the next entry in this job. |
| CSV file input | Input | Read from a simple CSV file. |
| Data grid | Input | Enter rows of static data in a grid, usually for testing, reference, or demo purposes. |
| Data validator | Validation | Validate data passing through the step against a set of rules. |
| Database lookup | Lookup | Look up values in a database using field values. |
| Delete | Output | Permanently remove rows from a database. |
| Dummy (do nothing) | Flow | Do nothing. Useful for testing, or when you want to split streams. |
| Filter rows | Flow | Filter rows using simple equations. |
| Formula | Scripting | Calculate a formula using Pentaho's libformula. |
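
The Analytic query step computes LEAD/LAG-style values by looking at neighbouring rows of an already-sorted stream. As a rough illustration only (the `day` and `sales` fields are invented, not part of the step's configuration), the semantics in plain Python look like this:

```python
# Hypothetical sorted input rows; LEAD/LAG only make sense on sorted data.
rows = [{"day": 1, "sales": 10}, {"day": 2, "sales": 12},
        {"day": 3, "sales": 9}, {"day": 4, "sales": 15}]

for i, row in enumerate(rows):
    # LAG(sales, 1): value from the previous row, None on the first row.
    row["sales_lag1"] = rows[i - 1]["sales"] if i > 0 else None
    # LEAD(sales, 1): value from the next row, None on the last row.
    row["sales_lead1"] = rows[i + 1]["sales"] if i < len(rows) - 1 else None

for row in rows:
    print(row)
```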

Steps: G - L

| Name | Category | Description |
| --- | --- | --- |
| Generate random value | Input | Generate random values. |
| Generate rows | Input | Generate a number of empty or equal rows. |
| Get repository names | Input | List detailed information about transformations and/or jobs in a repository. |
| Get rows from result | Job | Read rows from a previous entry in a job. |
| Get subfolder names | Input | Read a parent folder and return all subfolders. |
| Get system info | Input | Get information from the system, such as the system date and arguments. |
| Get table names | Input | Get table names from a database connection and send them to the next step. |
| Get variables | Job | Determine the values of certain (environment or Kettle) variables and put them in field values. |
| Google Analytics | Input | Fetch data from a Google Analytics account. |
| Group by | Statistics | Build aggregates in a group-by fashion. This works only on sorted input; if the input is not sorted, only consecutive rows with equal keys are grouped correctly (see the first sketch after this table). |
| HTTP client | Lookup | Call a web service over HTTP, supplying a base URL and allowing parameters to be set dynamically. |
| Insert / update | Output | Update or insert rows in a database based upon keys. |
| Java filter | Flow | Filter rows using Java code. |
| Job executor | Flow | Run a PDI job, passing parameters and rows. |
| Join rows (cartesian product) | Joins | Output the Cartesian product of the input streams. The number of output rows is the product of the row counts of the input streams. |
| JSON input | Input | Extract relevant portions out of JSON structures (a file or an incoming field) and output rows (see the second sketch after this table). |
| JSON output | Output | Create a JSON block and output it in a field or to a file. |
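
Why the Group by step needs sorted input: it only merges consecutive rows with equal keys, much like Python's `itertools.groupby`. A minimal sketch with invented rows:

```python
from itertools import groupby

# Unsorted input: the trailing "A" row is not adjacent to the other "A" rows.
rows = [("A", 1), ("A", 2), ("B", 3), ("A", 4)]

for key, group in groupby(rows, key=lambda r: r[0]):
    print(key, sum(value for _, value in group))
# Prints: A 3 / B 3 / A 4 -- the second run of "A" becomes its own group.
# Sorting the rows on the key first yields the expected result: A 7 / B 3.
```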
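
The JSON input step is configured with path expressions that pull values out of a JSON document and turn them into rows. A simplified sketch of that idea using only the standard library (the document and field names are made up; the step itself uses JSONPath-style expressions such as `$.orders[*].total`):

```python
import json

doc = json.loads('{"orders": [{"id": 1, "total": 9.5}, {"id": 2, "total": 3.0}]}')

# Roughly what extracting $.orders[*].id and $.orders[*].total into rows does.
rows = [{"id": order["id"], "total": order["total"]} for order in doc["orders"]]
print(rows)  # [{'id': 1, 'total': 9.5}, {'id': 2, 'total': 3.0}]
```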

Steps: M - R

| Name | Category | Description |
| --- | --- | --- |
| Merge join | Joins | Join two streams on a given key and output a joined set. The input streams must be sorted on the join key (see the sketch after this table). |
| Merge rows (diff) | Joins | Merge two streams of rows sorted on a certain key. The two streams are compared, and the identical, changed, deleted, and new rows are flagged. |
| Microsoft Excel input | Input | Read data from Excel and OpenOffice workbooks (XLS, XLSX, ODS). |
| Microsoft Excel writer | Output | Write or append data to an Excel file. |
| MongoDB input | Big Data | Read all entries from a MongoDB collection in the specified database. |
| MongoDB output | Big Data | Write to a MongoDB collection. |
| Null if | Utility | Set a field value to null if it is equal to a constant value. |
| Python executor | Scripting | Map upstream data from a PDI input step, or execute a Python script to generate data. When you send all rows, Python stores the dataset in a variable that kicks off your Python script. |
| REST client | Lookup | Consume RESTful services. REpresentational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which web services are viewed as resources identifiable by their URLs. |
| Row denormaliser | Transform | Denormalise rows by looking up key-value pairs and assigning them to new fields in the output rows. This method aggregates, so the input rows must be sorted on the grouping fields. |
| Row normaliser | Transform | Normalise de-normalised information. |
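
The Merge join step can stream through both inputs in a single pass precisely because they arrive sorted on the join key. A sketch of that one-pass inner join (the function and the sample rows are illustrative, not the step's actual implementation):

```python
def merge_join(left, right, key):
    """One-pass inner join of two row lists that are pre-sorted on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with the whole run of matching right rows.
            j2 = j
            while j2 < len(right) and right[j2][key] == lk:
                out.append({**left[i], **right[j2]})
                j2 += 1
            i += 1
    return out

left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
right = [{"id": 2, "city": "Rome"}, {"id": 3, "city": "Oslo"}]
print(merge_join(left, right, "id"))  # [{'id': 2, 'name': 'b', 'city': 'Rome'}]
```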

Steps: S - Z

| Name | Category | Description |
| --- | --- | --- |
| Salesforce delete | Output | Delete records in a Salesforce module. |
| Salesforce input | Input | Read information from Salesforce. |
| Salesforce insert | Output | Insert records in a Salesforce module. |
| Salesforce update | Output | Update records in a Salesforce module. |
| Salesforce upsert | Output | Insert or update records in a Salesforce module. |
| Select values | Transform | Select or remove fields in a row. Optionally, set the field metadata: type, length, and precision. |
| Set variables | Job | Set environment variables based on a single input row. |
| Sort rows | Transform | Sort rows based upon field values (ascending or descending). |
| Sorted merge | Joins | Merge rows coming from multiple input steps, provided the rows are themselves sorted on the given key fields. |
| Split field to rows | Transform | Split a single string field by delimiter, creating a new row for each split term. |
| Split fields | Transform | Split a single field into more than one. |
| Stream lookup | Lookup | Look up values coming from another stream in the transformation. |
| String operations | Transform | Apply operations such as trimming and padding to a string value. |
| Strings cut | Transform | Cut out a snippet of a string. |
| Switch / case | Flow | Switch a row to a certain target step based on the case value in a field. |
| Table exists | Lookup | Check whether a table exists on a specified connection. |
| Table input | Input | Read information from a database table. |
| Table output | Output | Write information to a database table. |
| Text file input | Input | Read data from a text file in several formats. This data can then be passed to your next step(s). |
| Text file output | Output | Write rows to a text file. |
| Transformation executor | Flow | Run a PDI transformation, setting parameters and passing rows. |
| Unique rows | Transform | Remove duplicate rows, leaving only unique occurrences. This works only on sorted input; if the input is not sorted, only consecutive duplicate rows are removed (see the sketch after this table). |
| Update | Output | Update data in a database table based upon keys. |
| Write to log | Utility | Write data to the log. |
| Zip file | Utility | Create a standard ZIP archive from the data stream fields. |
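
Like Group by, the Unique rows step only compares each row with the one before it, which is why unsorted input leaves non-adjacent duplicates in place. A small sketch of that behaviour with invented values:

```python
def unique_rows(rows):
    """Keep a row only if it differs from the immediately preceding row."""
    out, prev = [], object()  # sentinel that compares unequal to any row
    for row in rows:
        if row != prev:
            out.append(row)
        prev = row
    return out

print(unique_rows(["a", "a", "b", "a"]))          # ['a', 'b', 'a'] -- unsorted input
print(unique_rows(sorted(["a", "a", "b", "a"])))  # ['a', 'b']      -- sorted first
```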
