# PDI transformation steps

Steps extend and expand the functionality of Pentaho Data Integration (PDI) transformations. You can use the following steps in PDI.

## Steps: A - F

| Name                                                                                                                                                                                                                            | Category       | Description                                                                                                                                                                                                                   |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Abort](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/abort)                                                                                                                       | Flow           | Abort a transformation.                                                                                                                                                                                                       |
| [Add a Checksum](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/add-a-checksum)                                                                                                     | Transform      | Add a checksum column for each input row.                                                                                                                                                                                     |
| [Add constants](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558172/Add+Constants)                                                                                                                          | Transform      | Add one or more constants to the input rows.                                                                                                                                                                                  |
| [Add sequence](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/add-sequence-step-article)                                                                                            | Transform      | Get the next value from a sequence.                                                                                                                                                                                           |
| [Add value fields changing sequence](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386799997/Add+value+fields+changing+sequence)                                                                                | Transform      | Add a sequence that depends on field value changes. Each time the value of at least one field changes, PDI resets the sequence.                                                                                               |
| [Add XML](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/370967522/Add+XML)                                                                                                                                      | Transform      | Encode several fields into an XML fragment.                                                                                                                                                                                   |
| [AMQP Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/amqp-consumer)                                                                                                       | Streaming      | Pull streaming data from an AMQP broker or clients through an AMQP transformation.                                                                                                                                            |
| [AMQP Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/amqp-producer)                                                                                                       | Streaming      | Publish messages in near-real-time to an AMQP broker.                                                                                                                                                                         |
| [Analytic query](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/375133731/Analytic+Query)                                                                                                                        | Statistics     | Execute analytic queries over a sorted dataset (LEAD/LAG/FIRST/LAST).                                                                                                                                                         |
| [Annotate stream](https://docs.pentaho.com/pdia-data-integration/extracting-data-into-pdi/work-with-the-streamlined-data-refinery/use-the-streamlined-data-refinery/building-blocks-for-the-sdr/using-the-annotate-stream-step) | Flow           | Refine your data for the Streamlined Data Refinery by creating measures, link dimensions, or attributes on stream fields.                                                                                                     |
| [Append streams](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081851/Append+streams)                                                                                                                        | Flow           | Append two streams in an ordered way.                                                                                                                                                                                         |
| [ARFF output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311060/ARFF+Output)                                                                                                                              | Data Mining    | Write data in ARFF format to a file.                                                                                                                                                                                          |
| [Automatic Documentation Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386793665/Automatic+Documentation+Output)                                                                                        | Output         | Generate documentation automatically based on input in the form of a list of transformations and jobs.                                                                                                                        |
| [Avro Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/avro-input)                                                                                                             | Big Data       | Decode binary or JSON Avro data and extract fields from the structure it defines, either from flat files or incoming fields.                                                                                                  |
| Avro input (deprecated)                                                                                                                                                                                                         | Deprecated     | Replaced by [Avro Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/avro-input).                                                                                              |
| [Avro Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/avro-output)                                                                                                           | Big Data       | Serialize data into Avro binary or JSON format from the PDI data stream, then write it to a file.                                                                                                                             |
| [Block this step until steps finish](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386799999/Block+this+step+until+steps+finish)                                                                                | Flow           | Block this step until selected steps finish.                                                                                                                                                                                  |
| [Blocking step](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558183/Blocking+step)                                                                                                                          | Flow           | Block flow until all incoming rows have been processed. Subsequent steps only receive the last input row to this step.                                                                                                        |
| [Calculator](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/calculator)                                                                                                             | Transform      | Create new fields by performing simple calculations.                                                                                                                                                                          |
| [Call DB Procedure](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558140/Call+DB+Procedure)                                                                                                                  | Lookup         | Get back information by calling a database procedure.                                                                                                                                                                         |
| [Call Endpoint](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388317884/Call+Endpoint)                                                                                                                          | Pentaho Server | Call API endpoints from the Pentaho Server within a PDI transformation.                                                                                                                                                       |
| Cassandra Input                                                                                                                                                                                                                 | Deprecated     | No longer a part of the PDI distribution. Contact [Pentaho Support](https://support.pentaho.com/) for details.                                                                                                                |
| Cassandra Output                                                                                                                                                                                                                | Deprecated     | No longer a part of the PDI distribution. Contact [Pentaho Support](https://support.pentaho.com/) for details.                                                                                                                |
| Catalog Input                                                                                                                                                                                                                   | Deprecated     | Read CSV text file formats of a Pentaho Data Catalog resource that is stored in a Hadoop Distributed File System (HDFS) or Amazon S3 ecosystem, and then output the data as table rows that can be used by a transformation.  |
| Catalog Output                                                                                                                                                                                                                  | Deprecated     | Encode CSV text file formats by using the schema that is defined in PDI to create or replace a data resource in Pentaho Data Catalog and add metadata to the data resource.                                                   |
| [Change file encoding](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800011/Change+file+encoding)                                                                                                            | Utility        | Change file encoding and create a new file.                                                                                                                                                                                   |
| [Check if a column exists](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372703368/Check+if+a+column+exists)                                                                                                    | Lookup         | Check if a column exists in a table on a specified connection.                                                                                                                                                                |
| [Check if file is locked](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800024/Check+if+file+is+locked)                                                                                                      | Lookup         | Check if a file is locked by another process.                                                                                                                                                                                 |
| [Check if webservice is available](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386793916/Check+if+webservice+is+available)                                                                                    | Lookup         | Check if a webservice is available.                                                                                                                                                                                           |
| [Clone row](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081860/Clone+row)                                                                                                                                  | Utility        | Clone a row as many times as needed.                                                                                                                                                                                          |
| [Closure Generator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/364316814/Closure+Generator)                                                                                                                  | Transform      | Generate a closure table using parent-child relationships.                                                                                                                                                                    |
| [Combination lookup/update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558225/Combination+lookup-update)                                                                                                  | Data Warehouse | Update a junk dimension in a data warehouse. Alternatively, look up information in this dimension. The primary key of a junk dimension consists of all the fields.                                                            |
| [Concat Fields](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386803438/Concat+Fields)                                                                                                                          | Transform      | Concatenate multiple fields into one target field. The fields can be separated by a separator and the enclosure logic is completely compatible with the Text File Output step.                                                |
| [Copybook Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/copybook-input-pdi-step)                                                                                            | Discovery      | Read binary data files that are mapped by a fixed-length COBOL copybook definition file.                                                                                                                                      |
| [Copy rows to result](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558228/Copy+rows+to+result)                                                                                                              | Job            | Write rows to the executing job. The information will then be passed to the next entry in this job.                                                                                                                           |
| [CouchDB Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/couchdb-input)                                                                                                       | Big Data       | Retrieve all documents from a given view in a given design document from a given database.                                                                                                                                    |
| [Credit card validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800027/Credit+card+validator)                                                                                                          | Validation     | Determine whether a credit card number is valid (using the LUHN10 (MOD-10) algorithm) and which credit card vendor handles that number (VISA, MasterCard, Diners Club, EnRoute, American Express (AMEX), and others).         |
| [CSV File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/csv-file-input)                                                                                                     | Input          | Read from a simple CSV file input.                                                                                                                                                                                            |
| [Data Grid](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800034/Data+Grid)                                                                                                                                  | Input          | Enter rows of static data in a grid, usually for testing, reference, or demo purposes.                                                                                                                                        |
| [Data Validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/367624189/Data+Validator)                                                                                                                        | Validation     | Validate passing data based on a set of rules.                                                                                                                                                                                |
| [Database join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558192/Database+Join)                                                                                                                          | Lookup         | Execute a database query using stream values as parameters.                                                                                                                                                                   |
| [Database lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558138/Database+lookup)                                                                                                                      | Lookup         | Look up values in a database using field values.                                                                                                                                                                              |
| [De-serialize from file](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558119/De-serialize+from+file)                                                                                                        | Input          | Read rows of data from a data cube.                                                                                                                                                                                           |
| [Delay row](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081863/Delay+row)                                                                                                                                  | Utility        | Output each input row after a delay.                                                                                                                                                                                          |
| [Delete](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/delete-step-pdi)                                                                                                            | Output         | Permanently remove a row from a database.                                                                                                                                                                                     |
| [Detect empty stream](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800030/Detect+empty+stream)                                                                                                              | Flow           | Output one empty row if the input stream is empty (that is, when it contains no rows).                                                                                                                                        |
| [Discover metadata from a text file](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/discover-metadata-from-a-text-file)                                                             | Input          | Determine the structure of delimited text files.                                                                                                                                                                              |
| [Dimension lookup/update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558220/Dimension+Lookup-Update)                                                                                                      | Data Warehouse | Update a slowly changing dimension in a data warehouse. Alternatively, look up information in this dimension.                                                                                                                 |
| [Dummy (do nothing)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558152/Dummy+do+nothing)                                                                                                                  | Flow           | Does not do anything. It is useful, however, when testing things or in certain situations where you want to split streams.                                                                                                    |
| [Dynamic SQL row](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376444048/Dynamic+SQL+row)                                                                                                                      | Lookup         | Execute a dynamic SQL statement built in a previous field.                                                                                                                                                                    |
| [Edi to XML](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386798790/Edi+to+XML)                                                                                                                                | Utility        | Convert an Edifact message to XML to simplify data extraction.                                                                                                                                                                |
| ElasticSearch Bulk Insert (deprecated)                                                                                                                                                                                          | Deprecated     | Replaced by [Elasticsearch REST bulk insert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/elasticsearch-rest-bulk-insert).                                                      |
| [Elasticsearch REST bulk insert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/elasticsearch-rest-bulk-insert)                                                                     | Bulk loading   | Perform bulk inserts into Elasticsearch.                                                                                                                                                                                      |
| [Email messages input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311043/Email+Messages+Input)                                                                                                            | Input          | Connect to a POP3/IMAP server and retrieve messages.                                                                                                                                                                          |
| [ESRI Shapefile Reader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311488/ESRI+Shapefile+Reader)                                                                                                          | Input          | Read shape file data from an ESRI shape file and linked DBF file.                                                                                                                                                             |
| [ETL metadata injection](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection)                                                                                     | Flow           | Inject metadata into an existing transformation prior to execution. This allows for the creation of dynamic and highly flexible data integration solutions.                                                                   |
| Example step (deprecated)                                                                                                                                                                                                       | Deprecated     | Is an example of a plugin test step.                                                                                                                                                                                          |
| [Execute a process](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442106/Execute+a+process)                                                                                                                  | Utility        | Execute a process and return the result.                                                                                                                                                                                      |
| [Execute Row SQL Script](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/execute-row-sql-script-cp)                                                                                  | Scripting      | Execute an SQL statement or file for every input row.                                                                                                                                                                         |
| [Execute SQL Script](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/execute-sql-script-cp)                                                                                          | Scripting      | Execute an SQL script, optionally parameterized using input rows.                                                                                                                                                             |
| [Extract to rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/extract-to-rows)                                                                                                   | Input          | Parse hierarchical data type fields coming from a previous step.                                                                                                                                                              |
| [File exists (Step)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/file-exists-step)                                                                                               | Lookup         | Check if a file exists.                                                                                                                                                                                                       |
| [Filter Rows](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558145/Filter+Rows)                                                                                                                              | Flow           | Filter rows using simple equations.                                                                                                                                                                                           |
| [Fixed file input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558257/Fixed+File+Input)                                                                                                                    | Input          | Read from a fixed file input.                                                                                                                                                                                                 |
| [Formula](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442115/Formula)                                                                                                                                      | Scripting      | Calculate a formula using Pentaho's libformula.                                                                                                                                                                               |
| [Fuzzy match](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310481/Fuzzy+match)                                                                                                                              | Lookup         | Find approximate matches to a string using matching algorithms. Read a field from the main stream and output the approximate value from the lookup stream.                                                                    |
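
The Credit card validator entry in the table above mentions the LUHN10 (MOD-10) check. As a rough illustration of that checksum only, here is a minimal Python sketch. The function name `luhn_is_valid` is hypothetical, this is not PDI's implementation, and it does not attempt the vendor detection that the step also performs.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn (MOD-10) check.

    Hypothetical helper for illustration; spaces, hyphens, and any other
    non-digit characters are ignored.
    """
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if not digits:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Equivalent to summing the two digits of the doubled value.
                digit -= 9
        total += digit

    # The number is valid when the total is a multiple of 10.
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_is_valid("4111 1111 1111 1111"))  # True
    print(luhn_is_valid("4111 1111 1111 1112"))  # False
```

A passing Luhn check only shows that the number is well-formed; it says nothing about whether the card exists or is active.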

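The Fuzzy match entry describes reading a value from a main stream and returning the closest value from a lookup stream. A minimal sketch of that idea follows. It uses the similarity ratio from Python's standard `difflib.SequenceMatcher` as a stand-in, which is an assumption made for illustration and not necessarily one of the algorithms the step offers. The names `best_match`, `min_similarity`, `main_stream`, and `lookup_stream` are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(value, lookup_values, min_similarity=0.6):
    """Return (closest lookup value, similarity), or (None, 0.0) if nothing qualifies."""
    best, best_score = None, 0.0
    for candidate in lookup_values:
        # Case-insensitive similarity in the range 0.0 to 1.0.
        score = SequenceMatcher(None, value.lower(), candidate.lower()).ratio()
        if score >= min_similarity and score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical sample data standing in for the two streams.
lookup_stream = ["Pentaho", "Kettle", "Spoon"]
main_stream = ["pentaho", "Ketle", "Carte"]

for value in main_stream:
    match, score = best_match(value, lookup_stream)
    print(value, "->", match)
```

In this sketch, `Ketle` resolves to `Kettle`, while `Carte` has no lookup value above the threshold and yields `None`.
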
## Steps: G - L

| Name                                                                                                                                                   | Category       | Description                                                                                                                                                                     |
| ------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Generate random credit card numbers](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800040/Generate+random+credit+card+numbers)     | Input          | Generate random valid (Luhn check) credit card numbers.                                                                                                                         |
| [Generate random value](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558100/Generate+Random+Value)                                 | Input          | Generate random value.                                                                                                                                                          |
| [Generate Rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/generate-rows)                              | Input          | Generate a number of empty or equal rows.                                                                                                                                       |
| [Get data from XML](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081230/Get+Data+From+XML)                                         | Input          | Get data from XML file by using XPath. This step also allows you to parse XML defined in a previous field.                                                                      |
| [Get File Names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558124/Get+File+Names)                                               | Input          | Get file names from the operating system and send them to the next step.                                                                                                        |
| [Get files from result](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558238/Get+files+from+result)                                 | Job            | Read filenames used or generated in a previous entry in a job.                                                                                                                  |
| [Get Files Rows Count](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558265/Get+Files+Rows+Count)                                   | Input          | Get files rows count.                                                                                                                                                           |
| [Get ID from slave server](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/385450958/Get+ID+from+Slave+Server)                           | Transform      | Retrieve unique IDs in blocks from a slave server. The referenced sequence needs to be configured on the slave server in the XML configuration file.                            |
| [Get records from stream](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/get-records-from-stream)          | Streaming      | Return records that were previously generated by another transformation in a job.                                                                                               |
| [Get repository names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800043/Get+repository+names)                                   | Input          | List detailed information about transformations and/or jobs in a repository.                                                                                                    |
| [Get rows from result](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/get-rows-from-result)                | Job            | Read rows from a previous entry in a job.                                                                                                                                       |
| [Get Session Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388317880/Get+Session+Variables)                                 | Pentaho Server | Retrieve the value of a session variable.                                                                                                                                       |
| [Get SubFolder names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311493/Get+SubFolder+names)                                     | Input          | Read a parent folder and return all subfolders.                                                                                                                                 |
| [Get System Info](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/get-system-info)                          | Input          | Get information from the system like system date, arguments, etc.                                                                                                               |
| [Get table names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311496/Get+table+names)                                             | Input          | Get table names from database connection and send them to the next step.                                                                                                        |
| [Get Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558237/Get+Variable)                                                  | Job            | Determine the values of certain (environment or Kettle) variables and put them in field values.                                                                                 |
| Google Analytics (deprecated)                                                                                                                          | Deprecated     | Fetch data from Google Analytics account. Replacement step is Google Analytics v4. **Note:** This step will only work for Universal Analytics 360 customers until July 1, 2024. |
| [Google Analytics v4](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/google-analytics-v4)                  | Input          | Fetch data from Google Analytics account.                                                                                                                                       |
| Greenplum Bulk Loader (deprecated)                                                                                                                     | Deprecated     | Bulk load Greenplum data. Replacement step is [Greenplum Load](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/384370894/Greenplum+Load).                         |
| [Greenplum Load](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/384370894/Greenplum+Load)                                               | Bulk loading   | Bulk load Greenplum data.                                                                                                                                                       |
| [Group By](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/group-by-landing-page-article)                   | Statistics     | Build aggregates in a group by fashion. This works only on sorted input. If the input is not sorted, only consecutive rows with the same group key are aggregated together.     |
| [GZIP CSV Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/377094151/GZIP+CSV+Input)                                               | Input          | Read in parallel from a GZIP CSV file.                                                                                                                                          |
| [Hadoop File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hadoop-file-input-cp-main-page)         | Big Data       | Read data from a variety of different text-file types stored on a Hadoop cluster.                                                                                               |
| [Hadoop File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hadoop-file-output-cp-main-page)       | Big Data       | Write data to a variety of different text-file types stored on a Hadoop cluster.                                                                                                |
| [HBase Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hbase-input-cp-main-page)                     | Big Data       | Read from an HBase column family.                                                                                                                                               |
| [HBase Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hbase-output)                                | Big Data       | Write to an HBase column family.                                                                                                                                                |
| [HBase row decoder](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hbase-row-decoder-pdi)                  | Big Data       | Decode an incoming key and HBase result object to a mapping.                                                                                                                    |
| [Hierarchical JSON input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hierarchical-json-input)          | Input          | Load JSON data into PDI from a previous step or from a file.                                                                                                                    |
| [Hierarchical JSON Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hierarchical-json-output)        | Output         | Convert hierarchical data from a previous step into JSON format.                                                                                                                |
| [HL7 Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386798456/HL7+Input)                                                         | Input          | Read data from HL7 data streams.                                                                                                                                                |
| [HTTP client](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558141/HTTP+Client)                                                     | Lookup         | Call a web service over HTTP by supplying a base URL and allowing parameters to be set dynamically.                                                                             |
| [HTTP Post](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376441850/HTTP+Post)                                                         | Lookup         | Make a web service request over HTTP by supplying a base URL and allowing parameters to be set dynamically.                                                                     |
| [IBM WebSphere MQ Consumer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310162/IBM+Websphere+MQ+Consumer+Deprecated) | Deprecated     | Receive messages from any IBM WebSphere MQ Server.                                                                                                                              |
| [IBM WebSphere MQ Producer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310166/IBM+Websphere+MQ+Producer+Deprecated) | Deprecated     | Send messages to any IBM WebSphere MQ Server.                                                                                                                                   |
| [Identify last row in a stream](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310476/Identify+last+row+in+a+stream)                 | Flow           | Mark the last row.                                                                                                                                                              |
| [If field value is null](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442127/If+field+value+is+null)                               | Utility        | Set a field value to a constant if it is null.                                                                                                                                  |
| [Infobright Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800365/Infobright+Loader)                                         | Bulk loading   | Load data to an Infobright database table.                                                                                                                                      |
| [Ingres VectorWise Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/385450759/Ingres+VectorWise+Bulk+Loader)                 | Bulk loading   | Interface with the Ingres VectorWise Bulk Loader "COPY TABLE" command.                                                                                                          |
| [Injector](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558241/Injector)                                                           | Inline         | Inject rows into the transformation through the Java API.                                                                                                                       |
| [Insert / Update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558126/Insert+-+Update)                                             | Output         | Update or insert rows in a database based upon keys.                                                                                                                            |
| [Java Filter](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/java-filter-pdi-step)                         | Flow           | Filter rows using java code.                                                                                                                                                    |
| [JMS Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-consumer)                                | Streaming      | Receive messages from a JMS server.                                                                                                                                             |
| [JMS consumer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310137/JMS+Consumer+Deprecated)                           | Deprecated     | Replaced by [JMS Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-consumer).                                            |
| [JMS Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-producer)                                | Streaming      | Send messages to a JMS server.                                                                                                                                                  |
| [JMS producer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310145/JMS+Producer+Deprecated)                           | Deprecated     | Replaced by [JMS Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-producer).                                            |
| [Job Executor](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/job-executor)                                | Flow           | Run a PDI job and pass parameters and rows.                                                                                                                                     |
| [Join Rows (cartesian product)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558187/Join+Rows+Cartesian+product)                   | Joins          | Output the Cartesian product of the input streams. The number of output rows is the product of the row counts of the input streams.                                             |
| [JSON Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/json-input)                                    | Input          | Extract relevant portions out of JSON structures (file or incoming field) and output rows.                                                                                      |
| [JSON output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388309851/JSON+output)                                                     | Output         | Create JSON block and output it in a field or a file.                                                                                                                           |
| [Kafka consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kafka-consumer)                            | Streaming      | Run a sub-transformation that executes according to message batch size or duration, letting you process a continuous stream of records in near-real-time.                       |
| [Kafka Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kafka-producer)                            | Streaming      | Publish messages in near-real-time across worker nodes where multiple, subscribed members have access.                                                                          |
| [Kinesis Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kinesis-consumer)                        | Streaming      | Extract data from a specific stream located within the Amazon Kinesis Data Streams service.                                                                                     |
| [Kinesis Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kinesis-producer)                        | Streaming      | Push data to an existing region and stream located within the Amazon Kinesis Data Streams service.                                                                              |
| [Knowledge Flow](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311062/Knowledge+Flow)                                               | Data Mining    | Execute a Knowledge Flow data mining process.                                                                                                                                   |
| [LDAP Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558260/LDAP+Input)                                                       | Input          | Read data from LDAP host.                                                                                                                                                       |
| [LDAP Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311403/LDAP+Output)                                                     | Output         | Perform insert, upsert, update, add or delete operations on records based on their DN (Distinguished Name).                                                                     |
| [LDIF Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081239/LDIF+Input)                                                       | Input          | Read data from LDIF files.                                                                                                                                                      |
| [Load file content in memory](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311567/Load+file+content+in+memory)                     | Input          | Load file content in memory.                                                                                                                                                    |
| LucidDB streaming loader (deprecated)                                                                                                                  | Deprecated     | Load data into LucidDB by using Remote Rows UDX.                                                                                                                                |

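The Group By entry above notes that the step expects sorted input. Python's `itertools.groupby` merges only adjacent items with equal keys, so it serves as an analogy for that behavior. The sketch below is an illustration under that assumption, not PDI's implementation. The function `group_consecutive` and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def group_consecutive(rows):
    """Sum values per key, merging only adjacent rows that share a key."""
    return [
        (key, sum(value for _, value in group))
        for key, group in groupby(rows, key=itemgetter(0))
    ]


# Hypothetical (key, value) rows in which the "east" rows are not adjacent.
rows = [("east", 10), ("west", 5), ("east", 7)]

unsorted_result = group_consecutive(rows)
sorted_result = group_consecutive(sorted(rows, key=itemgetter(0)))

print(unsorted_result)  # [('east', 10), ('west', 5), ('east', 7)]
print(sorted_result)    # [('east', 17), ('west', 5)]
```

Sorting on the group key first yields one aggregate per key. The Memory Group By step in the next table is described as not requiring sorted input.
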
## Steps: M - R

| Name                                                                                                                                                                | Category           | Description                                                                                                                                                                                                                       |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Mail](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386793725/Mail+step)                                                                           | Utility            | Send e-mail.                                                                                                                                                                                                                      |
| [Mail Validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081634/Mail+Validator)                                                            | Validation         | Check if an email address is valid.                                                                                                                                                                                               |
| [Mapping](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mapping)                                                       | Mapping            | Run a mapping (sub-transformation). Use MappingInput and MappingOutput to specify the fields interface.                                                                                                                           |
| [Mapping Input Specification](https://docs.pentaho.com/pdia-data-integration/broken-reference)                                                                      | Mapping            | Specify the input interface of a mapping.                                                                                                                                                                                         |
| [Mapping Output Specification](https://docs.pentaho.com/pdia-data-integration/broken-reference)                                                                     | Mapping            | Specify the output interface of a mapping.                                                                                                                                                                                        |
| [MapReduce Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mapreduce-input)                                       | Big Data           | Enter Key Value pairs from Hadoop MapReduce.                                                                                                                                                                                      |
| [MapReduce Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mapreduce-output)                                     | Big Data           | Exit Key Value pairs, then push into Hadoop MapReduce.                                                                                                                                                                            |
| [MaxMind GeoIP Lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/364316813/Pentaho+Data+Integration+Steps)                                      | Lookup             | Look up an IPv4 address in a MaxMind database and add fields such as geography, ISP, or organization.                                                                                                                             |
| [Memory Group By](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/memory-group-by)                                       | Statistics         | Build aggregates in a group by fashion. This step doesn't require sorted input.                                                                                                                                                   |
| [Merge Join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558201/Merge+Join)                                                                    | Joins              | Join two streams on a given key and output a joined set. The input streams must be sorted on the join key.                                                                                                                        |
| [Merge Rows (diff)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/merge-rows-diff)                                     | Joins              | Merge two streams of rows, sorted on a certain key. The two streams are compared and rows are flagged as identical, changed, deleted or new.                                                                                      |
| [Metadata structure of stream](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386794485/Metadata+Structure+of+Stream)                                | Utility            | Read the metadata of the incoming stream.                                                                                                                                                                                         |
| [Microsoft Access Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558259/Access+Input)                                                      | Input              | Read data from a Microsoft Access file                                                                                                                                                                                            |
| [Microsoft Access Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558136/Access+Output)                                                    | Output             | Store records into an MS-Access database table.                                                                                                                                                                                   |
| [Microsoft Excel Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/microsoft-excel-input)                           | Input              | Read data from Excel and OpenOffice Workbooks (XLS, XLSX, ODS).                                                                                                                                                                   |
| Microsoft Excel Output (deprecated)                                                                                                                                 | Deprecated         | Replaced by [Microsoft Excel writer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/microsoft-excel-writer).                                                                          |
| [Microsoft Excel writer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/microsoft-excel-writer)                         | Output             | Write or appends data to an Excel file.                                                                                                                                                                                           |
| [Modified Java Script Value](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/modified-java-script-value)                 | Scripting          | Run JavaScript programs (and much more).                                                                                                                                                                                          |
| [Modify values from a single row](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/modify-values-from-a-single-row)       | Input              | Build complex hierarchical data.                                                                                                                                                                                                  |
| [Modify values from grouped rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/modify-values-from-grouped-rows)       | Input              | Modify hierarchical data to form nested JSON key-value pairs.                                                                                                                                                                     |
| [Mondrian Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mondrian-input-pdi-transformation-step-cp)              | Input              | Execute and retrieve data using an MDX query against a Pentaho Analysis OLAP server (Mondrian).                                                                                                                                   |
| [MonetDB Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386803510/MonetDB+bulk+loader)                                                  | Bulk loading       | Load data into MonetDB by using their bulk load command in streaming mode.                                                                                                                                                        |
| [MongoDB Execute](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mongodb-execute)                                       | Big Data           | Connect to a MongoDB cluster and execute Mongo shell-style commands.                                                                                                                                                              |
| [MongoDB Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mongodb-input)                                           | Big Data           | Read all entries from a MongoDB collection in the specified database.                                                                                                                                                             |
| [MongoDB Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mongodb-output)                                         | Big Data           | Write to a MongoDB collection.                                                                                                                                                                                                    |
| [MQTT Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mqtt-consumer)                                           | Streaming          | Pull streaming data from an MQTT broker or clients through an MQTT transformation.                                                                                                                                                |
| [MQTT Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mqtt-producer)                                           | Streaming          | Publish messages in near-real-time to an MQTT broker.                                                                                                                                                                             |
| [Multiway Merge Join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388303826/Multiway+Merge+Join)                                                  | Joins              | Join multiple streams. This step supports INNER and FULL OUTER joins.                                                                                                                                                             |
| [MySQL Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/378110379/MySQL+Bulk+Loader)                                                      | Bulk loading       | Load data over a named pipe (not available on MS Windows).                                                                                                                                                                        |
| [Null if...](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558164/Null+If)                                                                       | Utility            | Set a field value to null if it is equal to a constant value.                                                                                                                                                                     |
| [Number range](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310492/Number+range)                                                                | Transform          | Create ranges based on numeric field.                                                                                                                                                                                             |
| [OLAP Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/384370419/OLAP+Input)                                                                    | Input              | Execute and retrieve data using an MDX query against any XML/A OLAP datasource using olap4j.                                                                                                                                      |
| OpenERP object delete (deprecated)                                                                                                                                  | Deprecated         | Delete data from the OpenERP server using the XMLRPC interface with the 'unlink' function.                                                                                                                                        |
| OpenERP object input (deprecated)                                                                                                                                   | Deprecated         | Retrieve data from the OpenERP server using the XMLRPC interface with the 'read' function.                                                                                                                                        |
| OpenERP object output (deprecated)                                                                                                                                  | Deprecated         | Update data on the OpenERP server using the XMLRPC interface and the 'import' function.                                                                                                                                           |
| [Oracle Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558251/Oracle+Bulk+Loader)                                                    | Bulk loading       | Use Oracle Bulk Loader to load data.                                                                                                                                                                                              |
| [ORC Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/orc-input)                                                   | Big Data           | Read field data from ORC files into a PDI data stream.                                                                                                                                                                            |
| [ORC Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/orc-output)                                                 | Big Data           | Serialize data from the PDI data stream into the ORC file format and write it to a file.                                                                                                                                          |
| [Output steps metrics](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311428/Output+Steps+Metrics)                                                | Statistics         | Return metrics for one or several steps.                                                                                                                                                                                          |
| Palo cell input (deprecated)                                                                                                                                        | Deprecated         | Retrieve all cell data from a Palo cube.                                                                                                                                                                                          |
| Palo cell output (deprecated)                                                                                                                                       | Deprecated         | Update cell data in a Palo cube.                                                                                                                                                                                                  |
| Palo dim input (deprecated)                                                                                                                                         | Deprecated         | Return elements from a dimension in a Palo database.                                                                                                                                                                              |
| Palo dim output (deprecated)                                                                                                                                        | Deprecated         | Create/update dimension elements and element consolidations in a Palo database.                                                                                                                                                   |
| [Parquet Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/parquet-input)                                           | Big Data           | Decode Parquet data formats and extract fields from the structure they define.                                                                                                                                                    |
| [Parquet Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/parquet-output)                                         | Big Data           | Map fields within data files and choose where you want to process those files.                                                                                                                                                    |
| [Pentaho Reporting Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/pentaho-reporting-output)                     | Output             | Execute an existing report file (.prpt).                                                                                                                                                                                          |
| [PostgreSQL Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372704366/PostgreSQL+Bulk+Loader)                                            | Bulk loading       | Bulk load PostgreSQL data.                                                                                                                                                                                                        |
| [Prioritize streams](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388309676/Prioritize+streams)                                                    | Flow               | Prioritize streams in an ordered way.                                                                                                                                                                                             |
| [Process files](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442129/Process+files)                                                              | Utility            | Process one file per row (copy or move or delete). This step accepts only filenames as input.                                                                                                                                     |
| [Properties Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310203/Properties+Output)                                                      | Output             | Write data to a properties file.                                                                                                                                                                                                  |
| [Property Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081241/Property+Input)                                                            | Input              | Read data (key, value) from properties files.                                                                                                                                                                                     |
| [Python Executor](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/python-executor)                                       | Scripting          | Map upstream data from a PDI input step or execute a Python script to generate data. When you send all rows, Python stores the dataset in a variable that kicks off your Python script.                                           |
| [Query metadata from a database](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/query-metadata-from-a-database-article) | Metadata Discovery | Retrieves metadata from a database connection.                                                                                                                                                                                    |
| [Query HCP](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/query-hcp)                                                   | Input              | Uses the Metadata Query Engine (MQE) to query your Hitachi Content Platform (HCP) repository for objects, their URLs, and system metadata properties.                                                                             |
| [R script executor](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311468/R+script+executor)                                                      | Statistics         | Execute an R script within a PDI transformation.                                                                                                                                                                                  |
| [Read metadata from HCP](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/read-metadata-from-hcp)                         | Input              | Identifies an HCP object by its URL path then specifies a target annotation name to read.                                                                                                                                         |
| [Read metadata from Copybook](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/read-metadata-from-copybook)               | Metadata Discovery | Reads a binary fixed-length copybook definition file and outputs the file and column descriptor information as fields to PDI rows.                                                                                                |
| Read metadata                                                                                                                                                       | Deprecated         | Search and retrieve metadata in the Pentaho Data Catalog that is associated with specific data resources that are registered in Data Catalog.                                                                                     |
| [Regex Evaluation](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/regex-evaluation)                                     | Scripting          | Evaluate regular expressions. This step uses a regular expression to evaluate a field. It can also extract new fields out of an existing field with capturing groups.                                                             |
| [Replace in String](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/replace-in-string)                                   | Transform          | Replace all occurrences of a word in a string with another word.                                                                                                                                                                  |
| [Reservoir Sampling](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310819/Reservoir+Sampling)                                                    | Statistics         | Sample a fixed number of rows from the incoming stream.                                                                                                                                                                           |
| [REST client step](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/rest-client-step)                                     | Lookup             | Consume RESTful services. Representational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which web services are viewed as resources and can be identified by their URLs.    |
| [Row Denormaliser](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/row-denormaliser)                                     | Transform          | Denormalise rows by looking up key-value pairs and by assigning them to new fields in the output rows. This method aggregates and needs the input rows to be sorted on the grouping fields.                                       |
| [Row Flattener](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/row-flattener)                                           | Transform          | Flatten consecutive rows based on the order in which they appear in the input stream.                                                                                                                                             |
| [Row Normaliser](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/row-normaliser)                                         | Transform          | Normalise de-normalised information.                                                                                                                                                                                              |
| [RSS Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376443633/RSS+Input)                                                                      | Input              | Read RSS feeds.                                                                                                                                                                                                                   |
| [RSS Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311425/RSS+Output)                                                                    | Output             | Read RSS stream.                                                                                                                                                                                                                  |
| [Rule Executor](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386794190/Rule+Executor)                                                              | Scripting          | Execute a rule against each row (using Drools).                                                                                                                                                                                   |
| [Rule Accumulator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386794192/Rule+Accumulator)                                                        | Scripting          | Execute a rule against a set of rows (using Drools).                                                                                                                                                                              |
| [Run SSH commands](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386795027/Run+SSH+commands)                                                        | Utility            | Run SSH commands and returns result.                                                                                                                                                                                              |

## Steps: S - Z

| Name                                                                                                                                                                                                                                      | Category       | Description                                                                                                                                                              |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| [S3 CSV Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/s3-csv-input-cp)                                                                                                                | Input          | Read from an S3 CSV file.                                                                                                                                                |
| [S3 File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/s3-file-output-cp)                                                                                                            | Output         | Export data to a text file on an Amazon Simple Storage Service (S3).                                                                                                     |
| [Salesforce bulk operation](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-bulk-operation)                                                                                         | Bulk loading   | Perform bulk operations on Salesforce objects.                                                                                                                           |
| [Salesforce Delete](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-delete)                                                                                                         | Output         | Delete records in a Salesforce module.                                                                                                                                   |
| [Salesforce Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-input)                                                                                                           | Input          | Read information from Salesforce.                                                                                                                                        |
| [Salesforce Insert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-insert)                                                                                                         | Output         | Insert records in a Salesforce module.                                                                                                                                   |
| [Salesforce Update](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-update)                                                                                                         | Output         | Update records in a Salesforce module.                                                                                                                                   |
| [Salesforce Upsert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-upsert)                                                                                                         | Output         | Insert or update records in a Salesforce module.                                                                                                                         |
| [Sample rows](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800015/Sample+rows)                                                                                                                                        | Statistics     | Filter rows based on the line number.                                                                                                                                    |
| SAP input (deprecated)                                                                                                                                                                                                                    | Deprecated     | Read data from SAP ERP, optionally with parameters.                                                                                                                      |
| [SAS Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386795310/SAS+Input)                                                                                                                                            | Input          | Read files in sas7bdat (SAS) native format.                                                                                                                              |
| [Script](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311840/Script)                                                                                                                                                  | Experimental   | Calculate values by scripting in Ruby, Python, Groovy, JavaScript, and other scripting languages.                                                                        |
| [Select Values](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/select-values)                                                                                                                 | Transform      | Select or remove fields in a row. Optionally, set the field meta-data: type, length and precision.                                                                       |
| [Send message to Syslog](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311275/Send+message+to+Syslog)                                                                                                                  | Utility        | Send a message to a Syslog server.                                                                                                                                       |
| [Serialize to file](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558131/Serialize+to+file)                                                                                                                            | Output         | Write rows of data to a data cube.                                                                                                                                       |
| [Set Field Value](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/set-field-value)                                                                                                             | Transform      | Replace the value of a field with the value of another field.                                                                                                            |
| [Set Field Value to a Constant](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/set-field-value-to-a-constant)                                                                                 | Transform      | Set the value of a field to a constant.                                                                                                                                  |
| [Set files in result](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558239/Set+files+in+result)                                                                                                                        | Job            | Set filenames in the result of this transformation. Subsequent job entries can then use this information.                                                                |
| [Set Session Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388317882/Set+Session+Variables)                                                                                                                    | Pentaho Server | Set the value of a session variable.                                                                                                                                     |
| [Set Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558231/Set+Variables)                                                                                                                                    | Job            | Set environment variables based on a single input row.                                                                                                                   |
| [SFTP Put](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311842/SFTP+Put)                                                                                                                                              | Experimental   | Upload a file or a stream file to a remote host via SFTP.                                                                                                                |
| [Shared dimension](https://docs.pentaho.com/pdia-data-integration/extracting-data-into-pdi/work-with-the-streamlined-data-refinery/use-the-streamlined-data-refinery/building-blocks-for-the-sdr/using-the-shared-dimension-step-for-sdr) | Flow           | Refine your data for the Streamlined Data Refinery through the creation of dimensions which can be shared.                                                               |
| [Simple Mapping (sub-transformation)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/simple-mapping-sub-transformation)                                                                       | Mapping        | Turn a repetitive, re-usable part of a transformation (a sequence of steps) into a mapping (sub-transformation).                                                         |
| [Single Threader](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/single-threader)                                                                                                             | Flow           | Execute a sequence of steps in a single thread.                                                                                                                          |
| [Socket reader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558243/Socket+reader)                                                                                                                                    | Inline         | Read a socket. A socket client that connects to a server (Socket Writer step).                                                                                           |
| [Socket writer](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558244/Socket+writer)                                                                                                                                    | Inline         | Write a socket. A socket server that can send rows of data to a socket reader.                                                                                           |
| [Sort rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/sort-rows-transformation-step)                                                                                                     | Transform      | Sort rows based upon field values (ascending or descending).                                                                                                             |
| [Sorted Merge](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558198/Sorted+Merge)                                                                                                                                      | Joins          | Merge rows coming from multiple input steps provided these rows are themselves sorted on the given key fields.                                                           |
| [Split field to rows](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081873/Split+field+to+rows)                                                                                                                        | Transform      | Split a single string field by delimiter and create a new row for each split term.                                                                                       |
| [Split Fields](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/split-fields)                                                                                                                   | Transform      | Split a single field into more than one.                                                                                                                                 |
| [Splunk Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/splunk-input)                                                                                                                   | Transform      | Read data from Splunk.                                                                                                                                                   |
| [Splunk Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/splunk-output)                                                                                                                 | Transform      | Write data to Splunk.                                                                                                                                                    |
| [SQL File Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081683/SQL+File+Output)                                                                                                                                | Output         | Output SQL INSERT statements to a file.                                                                                                                                  |
| [Stream lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558139/Stream+Lookup)                                                                                                                                    | Lookup         | Look up values coming from another stream in the transformation.                                                                                                         |
| [String Operations](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/string-operations)                                                                                                          | Transform      | Apply operations such as trimming, padding, and others to string values.                                                                                                 |
| [Strings cut](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/strings-cut)                                                                                                                     | Transform      | Cut out a snippet of a string.                                                                                                                                           |
| [Switch / Case](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/switch-case)                                                                                                                   | Flow           | Switch a row to a certain target step based on the case value in a field.                                                                                                |
| [Synchronize after merge](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376443876/Synchronize+after+merge)                                                                                                                | Output         | Perform insert/update/delete in one go based on the value of a field.                                                                                                    |
| [Table Compare](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388302925/Table+Compare)                                                                                                                                     | Utility        | Compare the data from two tables (provided they have the same layout) and log the differences between them.                                                              |
| [Table exists](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372703380/Table+Exists)                                                                                                                                      | Lookup         | Check if a table exists on a specified connection.                                                                                                                       |
| [Table Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/table-input)                                                                                                                     | Input          | Read information from a database table.                                                                                                                                  |
| [Table Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/table-output)                                                                                                                   | Output         | Write information to a database table.                                                                                                                                   |
| [Teradata Fastload Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386798585/Teradata+Fastload+Bulk+Loader)                                                                                                    | Bulk loading   | Bulk load Teradata Fastload data.                                                                                                                                        |
| [Teradata TPT Insert Upsert Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310916/Teradata+TPT+Insert+Upsert+Bulk+Loader)                                                                                  | Bulk loading   | Bulk load via TPT using the tbuild command.                                                                                                                              |
| [Text File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-input-cp)                                                                                                          | Input          | Read data from a text file in several formats. This data can then be passed to your next step(s).                                                                        |
| Text file input (deprecated)                                                                                                                                                                                                              | Deprecated     | Replaced by [Text File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-input-cp).                            |
| [Text File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-output-cp)                                                                                                        | Output         | Write rows to a text file.                                                                                                                                               |
| Text file output (deprecated)                                                                                                                                                                                                             | Deprecated     | Replaced by [Text File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-output-cp).                          |
| [Transformation Executor](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/transformation-executor)                                                                                              | Flow           | Run a PDI transformation, set parameters, and pass rows.                                                                                                                 |
| [Unique Rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/unique-rows)                                                                                                                      | Transform      | Remove duplicate rows and leave only unique occurrences. This works only on sorted input. If the input is not sorted, only consecutive duplicates are handled correctly. |
| [Unique Rows (HashSet)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/unique-rows-hashset)                                                                                                    | Transform      | Remove duplicate rows and leave only unique occurrences by using a HashSet.                                                                                              |
| [Univariate Statistics](https://pentaho-community.atlassian.net/wiki/spaces/DATAMINING/pages/276956306/Using+the+Univariate+Statistics+Plugin)                                                                                            | Statistics     | Compute some simple stats based on a single input field.                                                                                                                 |
| [Update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558127/Update)                                                                                                                                                  | Output         | Update data in a database table based upon keys.                                                                                                                         |
| [User Defined Java Class](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/user-defined-java-class)                                                                                             | Scripting      | Program a step using Java code.                                                                                                                                          |
| [User Defined Java Expression](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376439964/User+Defined+Java+Expression)                                                                                                      | Scripting      | Calculate the result of a Java Expression using Janino.                                                                                                                  |
| [Value Mapper](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558180/Value+Mapper)                                                                                                                                      | Transform      | Map values of a certain field from one value to another.                                                                                                                 |
| [Vertica Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311122/Vertica+Bulk+Loader)                                                                                                                         | Bulk loading   | Bulk load data into a Vertica table using its high-performance COPY feature.                                                                                             |
| [Web services lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372703383/Web+services+lookup)                                                                                                                        | Lookup         | Look up information using web services (WSDL).                                                                                                                           |
| [Write metadata to HCP objects](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/write-metadata-to-hcp)                                                                                         | Output         | Write custom metadata fields to a Hitachi Content Platform object.                                                                                                       |
| Write metadata                                                                                                                                                                                                                            | Deprecated     | Add new metadata to metadata in the Pentaho Data Catalog that is associated with specific data resources.                                                                |
| [Write to log](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386806559/Write+to+log+step)                                                                                                                                 | Utility        | Write data to log.                                                                                                                                                       |
| [XBase input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558120/XBase+Input)                                                                                                                                        | Input          | Read records from an XBase type of database file (DBF).                                                                                                                  |
| [XML Input Stream (StAX)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/xml-input-stream-stax)                                                                                               | Input          | Process very large and complex XML files very fast.                                                                                                                      |
| [XML Join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/370967515/XML+Join)                                                                                                                                              | Joins          | Join a stream of XML-Tags into a target XML string.                                                                                                                      |
| [XML Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/xml-output-cp)                                                                                                                    | Output         | Write data to an XML file.                                                                                                                                               |
| [XSD Validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081877/XSD+Validator)                                                                                                                                    | Validation     | Validate XML source (files or streams) against XML Schema Definition.                                                                                                    |
| [XSL Transformation](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081880/XSL+Transformation)                                                                                                                          | Transform      | Transform XML stream using XSL (eXtensible Stylesheet Language).                                                                                                         |
| [Yaml Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311572/Yaml+Input)                                                                                                                                           | Input          | Read a YAML source (file or stream), parse it, convert it to rows, and write these to one or more outputs.                                                               |
| [Zip File](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388302923)                                                                                                                                                       | Utility        | Create a standard ZIP archive from the data stream fields.                                                                                                               |

