# PDI transformation steps

Steps extend and expand the functionality of Pentaho Data Integration (PDI) transformations. You can use the following steps in PDI.

## Steps: A - F

| Name                                                                                                                                                                                                                            | Category       | Description                                                                                                                                                                                                                   |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Abort](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/abort)                                                                                                                       | Flow           | Abort a transformation.                                                                                                                                                                                                       |
| [Add a Checksum](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/add-a-checksum)                                                                                                     | Transform      | Add a checksum column for each input row.                                                                                                                                                                                     |
| [Add constants](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558172/Add+Constants)                                                                                                                          | Transform      | Add one or more constants to the input rows.                                                                                                                                                                                  |
| [Add sequence](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/add-sequence-step-article)                                                                                            | Transform      | Get the next value from a sequence.                                                                                                                                                                                           |
| [Add value fields changing sequence](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386799997/Add+value+fields+changing+sequence)                                                                                | Transform      | Add a sequence that depends on field value changes. Each time the value of at least one field changes, PDI resets the sequence.                                                                                               |
| [Add XML](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/370967522/Add+XML)                                                                                                                                      | Transform      | Encode several fields into an XML fragment.                                                                                                                                                                                   |
| [AMQP Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/amqp-consumer)                                                                                                       | Streaming      | Pull streaming data from an AMQP broker or clients through an AMQP transformation.                                                                                                                                            |
| [AMQP Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/amqp-producer)                                                                                                       | Streaming      | Publish messages in near-real-time to an AMQP broker.                                                                                                                                                                         |
| [Analytic query](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/375133731/Analytic+Query)                                                                                                                        | Statistics     | Execute analytic queries over a sorted dataset (LEAD/LAG/FIRST/LAST).                                                                                                                                                         |
| [Annotate stream](https://docs.pentaho.com/pdia-data-integration/extracting-data-into-pdi/work-with-the-streamlined-data-refinery/use-the-streamlined-data-refinery/building-blocks-for-the-sdr/using-the-annotate-stream-step) | Flow           | Refine your data for the Streamlined Data Refinery by creating measures, link dimensions, or attributes on stream fields.                                                                                                     |
| [Append streams](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081851/Append+streams)                                                                                                                        | Flow           | Append two streams in an ordered way.                                                                                                                                                                                         |
| [ARFF output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311060/ARFF+Output)                                                                                                                              | Data Mining    | Write data in ARFF format to a file.                                                                                                                                                                                          |
| [Automatic Documentation Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386793665/Automatic+Documentation+Output)                                                                                        | Output         | Generate documentation automatically based on input in the form of a list of transformations and jobs.                                                                                                                        |
| [Avro Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/avro-input)                                                                                                             | Big Data       | Decode binary or JSON Avro data and extract fields from the structure it defines, either from flat files or incoming fields.                                                                                                  |
| Avro input (deprecated)                                                                                                                                                                                                         | Deprecated     | Replaced by [Avro Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/avro-input).                                                                                              |
| [Avro Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/avro-output)                                                                                                           | Big Data       | Serialize data into Avro binary or JSON format from the PDI data stream, then write it to a file.                                                                                                                             |
| [Block this step until steps finish](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386799999/Block+this+step+until+steps+finish)                                                                                | Flow           | Block this step until selected steps finish.                                                                                                                                                                                  |
| [Blocking step](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558183/Blocking+step)                                                                                                                          | Flow           | Block flow until all incoming rows have been processed. Subsequent steps only receive the last input row to this step.                                                                                                        |
| [Calculator](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/calculator)                                                                                                             | Transform      | Create new fields by performing simple calculations.                                                                                                                                                                          |
| [Call DB Procedure](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558140/Call+DB+Procedure)                                                                                                                  | Lookup         | Get back information by calling a database procedure.                                                                                                                                                                         |
| [Call Endpoint](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388317884/Call+Endpoint)                                                                                                                          | Pentaho Server | Call API endpoints from the Pentaho Server within a PDI transformation.                                                                                                                                                       |
| Cassandra Input                                                                                                                                                                                                                 | Deprecated     | No longer a part of the PDI distribution. Contact [Pentaho Support](https://support.pentaho.com/) for details.                                                                                                                |
| Cassandra Output                                                                                                                                                                                                                | Deprecated     | No longer a part of the PDI distribution. Contact [Pentaho Support](https://support.pentaho.com/) for details.                                                                                                                |
| Catalog Input                                                                                                                                                                                                                   | Deprecated     | Read CSV text file formats of a Pentaho Data Catalog resource that is stored in a Hadoop Distributed File System (HDFS) or Amazon S3 ecosystem, and then output the data as table rows that can be used by a transformation.  |
| Catalog Output                                                                                                                                                                                                                  | Deprecated     | Encode CSV text file formats by using the schema that is defined in PDI to create or replace a data resource in Pentaho Data Catalog and add metadata to the data resource.                                                   |
| [Change file encoding](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800011/Change+file+encoding)                                                                                                            | Utility        | Change file encoding and create a new file.                                                                                                                                                                                   |
| [Check if a column exists](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372703368/Check+if+a+column+exists)                                                                                                    | Lookup         | Check if a column exists in a table on a specified connection.                                                                                                                                                                |
| [Check if file is locked](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800024/Check+if+file+is+locked)                                                                                                      | Lookup         | Check if a file is locked by another process.                                                                                                                                                                                 |
| [Check if webservice is available](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386793916/Check+if+webservice+is+available)                                                                                    | Lookup         | Check if a webservice is available.                                                                                                                                                                                           |
| [Clone row](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081860/Clone+row)                                                                                                                                  | Utility        | Clone a row as many times as needed.                                                                                                                                                                                          |
| [Closure Generator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/364316814/Closure+Generator)                                                                                                                  | Transform      | Generate a closure table using parent-child relationships.                                                                                                                                                                    |
| [Combination lookup/update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558225/Combination+lookup-update)                                                                                                  | Data Warehouse | Update a junk dimension in a data warehouse. Alternatively, look up information in this dimension. The primary key of a junk dimension consists of all the fields.                                                            |
| [Concat Fields](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386803438/Concat+Fields)                                                                                                                          | Transform      | Concatenate multiple fields into one target field. The fields can be separated by a separator and the enclosure logic is completely compatible with the Text File Output step.                                                |
| [Copybook Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/copybook-input-pdi-step)                                                                                            | Discovery      | Reads binary data files that are mapped by a fixed-length COBOL copybook definition file.                                                                                                                                     |
| [Copy rows to result](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558228/Copy+rows+to+result)                                                                                                              | Job            | Write rows to the executing job. The information will then be passed to the next entry in this job.                                                                                                                           |
| [CouchDB Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/couchdb-input)                                                                                                       | Big Data       | Retrieve all documents from a given view in a given design document from a given database.                                                                                                                                    |
| [Credit card validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800027/Credit+card+validator)                                                                                                          | Validation     | Determines whether a credit card number is valid (using the LUHN10 (MOD-10) algorithm) and which credit card vendor handles that number (VISA, MasterCard, Diners Club, EnRoute, American Express (AMEX), and others).        |
| [CSV File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/csv-file-input)                                                                                                     | Input          | Read from a simple CSV file input.                                                                                                                                                                                            |
| [Data Grid](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800034/Data+Grid)                                                                                                                                  | Input          | Enter rows of static data in a grid, usually for testing, reference, or demo purposes.                                                                                                                                        |
| [Data Validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/367624189/Data+Validator)                                                                                                                        | Validation     | Validates passing data based on a set of rules.                                                                                                                                                                               |
| [Database join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558192/Database+Join)                                                                                                                          | Lookup         | Execute a database query using stream values as parameters.                                                                                                                                                                   |
| [Database lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558138/Database+lookup)                                                                                                                      | Lookup         | Look up values in a database using field values.                                                                                                                                                                              |
| [De-serialize from file](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558119/De-serialize+from+file)                                                                                                        | Input          | Read rows of data from a data cube.                                                                                                                                                                                           |
| [Delay row](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081863/Delay+row)                                                                                                                                  | Utility        | Output each input row after a delay.                                                                                                                                                                                          |
| [Delete](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/delete-step-pdi)                                                                                                            | Output         | Permanently removes a row from a database.                                                                                                                                                                                    |
| [Detect empty stream](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800030/Detect+empty+stream)                                                                                                              | Flow           | Output one empty row if the input stream is empty (that is, when the input stream does not contain any rows).                                                                                                                 |
| [Discover metadata from a text file](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/discover-metadata-from-a-text-file)                                                             | Input          | Determines the structure of delimited text files.                                                                                                                                                                             |
| [Dimension lookup/update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558220/Dimension+Lookup-Update)                                                                                                      | Data Warehouse | Update a slowly changing dimension in a data warehouse. Alternatively, look up information in this dimension.                                                                                                                 |
| [Dummy (do nothing)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558152/Dummy+do+nothing)                                                                                                                  | Flow           | Does not do anything. It is useful, however, when testing things or in certain situations where you want to split streams.                                                                                                    |
| [Dynamic SQL row](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376444048/Dynamic+SQL+row)                                                                                                                      | Lookup         | Execute a dynamic SQL statement built in a previous field.                                                                                                                                                                    |
| [Edi to XML](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386798790/Edi+to+XML)                                                                                                                                | Utility        | Convert an Edifact message to XML to simplify data extraction.                                                                                                                                                                |
| ElasticSearch Bulk Insert (deprecated)                                                                                                                                                                                          | Deprecated     | Replaced by [Elasticsearch REST bulk insert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/elasticsearch-rest-bulk-insert).                                                      |
| [Elasticsearch REST bulk insert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/elasticsearch-rest-bulk-insert)                                                                     | Bulk loading   | Perform bulk inserts into Elasticsearch.                                                                                                                                                                                      |
| [Email messages input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311043/Email+Messages+Input)                                                                                                            | Input          | Connect to a POP3/IMAP server and retrieve messages.                                                                                                                                                                          |
| [ESRI Shapefile Reader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311488/ESRI+Shapefile+Reader)                                                                                                          | Input          | Read shape file data from an ESRI shape file and linked DBF file.                                                                                                                                                             |
| [ETL metadata injection](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection)                                                                                     | Flow           | Inject metadata into an existing transformation prior to execution. This allows for the creation of dynamic and highly flexible data integration solutions.                                                                   |
| Example step (deprecated)                                                                                                                                                                                                       | Deprecated     | Is an example of a plugin test step.                                                                                                                                                                                          |
| [Execute a process](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442106/Execute+a+process)                                                                                                                  | Utility        | Execute a process and return the result.                                                                                                                                                                                      |
| [Execute Row SQL Script](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/execute-row-sql-script-cp)                                                                                  | Scripting      | Execute an SQL statement or file for every input row.                                                                                                                                                                         |
| [Execute SQL Script](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/execute-sql-script-cp)                                                                                          | Scripting      | Execute an SQL script, optionally parameterized using input rows.                                                                                                                                                             |
| [Extract to rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/extract-to-rows)                                                                                                   | Input          | Parses hierarchical data type fields coming from a previous step.                                                                                                                                                             |
| [File exists (Step)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/file-exists-step)                                                                                               | Lookup         | Check if a file exists.                                                                                                                                                                                                       |
| [Filter Rows](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558145/Filter+Rows)                                                                                                                              | Flow           | Filter rows using simple equations.                                                                                                                                                                                           |
| [Fixed file input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558257/Fixed+File+Input)                                                                                                                    | Input          | Read from a fixed file input.                                                                                                                                                                                                 |
| [Formula](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442115/Formula)                                                                                                                                      | Scripting      | Calculate a formula using Pentaho's libformula.                                                                                                                                                                               |
| [Fuzzy match](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310481/Fuzzy+match)                                                                                                                              | Lookup         | Find the approximate matches to a string using matching algorithms. Read a field from the main stream and output an approximate value from the lookup stream.                                                                 |

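To make the idea behind approximate matching concrete, here is a minimal sketch in Python. It is not PDI code and does not claim to reproduce the algorithms the Fuzzy match step offers. It assumes a plain Levenshtein edit distance, and the names `levenshtein`, `closest_match`, `main_stream`, and `lookup_stream` are hypothetical, chosen only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts, deletes, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def closest_match(value, lookup_values, max_distance=2):
    """Return the lookup value nearest to `value`, or None if nothing is within max_distance."""
    best = min(lookup_values, key=lambda candidate: levenshtein(value, candidate), default=None)
    if best is None or levenshtein(value, best) > max_distance:
        return None
    return best


# Hypothetical data: values from a "main stream" matched against a "lookup stream".
main_stream = ["Pentaho", "Ketle", "Spoonn", "Zebra"]
lookup_stream = ["Pentaho", "Kettle", "Spoon"]

matches = {value: closest_match(value, lookup_stream) for value in main_stream}
print(matches)
# {'Pentaho': 'Pentaho', 'Ketle': 'Kettle', 'Spoonn': 'Spoon', 'Zebra': None}
```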
## Steps: G - L

| Name                                                                                                                                                   | Category       | Description                                                                                                                                                                     |
| ------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Generate random credit card numbers](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800040/Generate+random+credit+card+numbers)     | Input          | Generate random valid (Luhn check) credit card numbers.                                                                                                                         |
| [Generate random value](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558100/Generate+Random+Value)                                 | Input          | Generate random value.                                                                                                                                                          |
| [Generate Rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/generate-rows)                              | Input          | Generate a number of empty or equal rows.                                                                                                                                       |
| [Get data from XML](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081230/Get+Data+From+XML)                                         | Input          | Get data from XML file by using XPath. This step also allows you to parse XML defined in a previous field.                                                                      |
| [Get File Names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558124/Get+File+Names)                                               | Input          | Get file names from the operating system and send them to the next step.                                                                                                        |
| [Get files from result](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558238/Get+files+from+result)                                 | Job            | Read filenames used or generated in a previous entry in a job.                                                                                                                  |
| [Get Files Rows Count](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558265/Get+Files+Rows+Count)                                   | Input          | Get files rows count.                                                                                                                                                           |
| [Get ID from slave server](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/385450958/Get+ID+from+Slave+Server)                           | Transform      | Retrieve unique IDs in blocks from a slave server. The referenced sequence needs to be configured on the slave server in the XML configuration file.                            |
| [Get records from stream](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/get-records-from-stream)          | Streaming      | Return records that were previously generated by another transformation in a job.                                                                                               |
| [Get repository names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800043/Get+repository+names)                                   | Input          | List detailed information about transformations and/or jobs in a repository.                                                                                                    |
| [Get rows from result](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/get-rows-from-result)                | Job            | Read rows from a previous entry in a job.                                                                                                                                       |
| [Get Session Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388317880/Get+Session+Variables)                                 | Pentaho Server | Retrieve the value of a session variable.                                                                                                                                       |
| [Get SubFolder names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311493/Get+SubFolder+names)                                     | Input          | Read a parent folder and return all subfolders.                                                                                                                                 |
| [Get System Info](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/get-system-info)                          | Input          | Get information from the system like system date, arguments, etc.                                                                                                               |
| [Get table names](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311496/Get+table+names)                                             | Input          | Get table names from database connection and send them to the next step.                                                                                                        |
| [Get Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558237/Get+Variable)                                                  | Job            | Determine the values of certain (environment or Kettle) variables and put them in field values.                                                                                 |
| Google Analytics (deprecated)                                                                                                                          | Deprecated     | Fetch data from a Google Analytics account. Replacement step is Google Analytics v4. **Note:** This step will only work for Universal Analytics 360 customers until July 1, 2024. |
| [Google Analytics v4](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/google-analytics-v4)                  | Input          | Fetch data from a Google Analytics account.                                                                                                                                     |
| Greenplum Bulk Loader (deprecated)                                                                                                                     | Deprecated     | Bulk load Greenplum data. Replacement step is [Greenplum Load](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/384370894/Greenplum+Load).                         |
| [Greenplum Load](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/384370894/Greenplum+Load)                                               | Bulk loading   | Bulk load Greenplum data.                                                                                                                                                       |
| [Group By](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/group-by-landing-page-article)                   | Statistics     | Build aggregates in a group by fashion. This step works only on sorted input. If the input is not sorted, only consecutive rows with the same key are grouped together.         |
| [GZIP CSV Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/377094151/GZIP+CSV+Input)                                               | Input          | Read in parallel from a GZIP CSV file.                                                                                                                                          |
| [Hadoop File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hadoop-file-input-cp-main-page)         | Big Data       | Read data from a variety of different text-file types stored on a Hadoop cluster.                                                                                               |
| [Hadoop File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hadoop-file-output-cp-main-page)       | Big Data       | Write data to a variety of different text-file types stored on a Hadoop cluster.                                                                                                |
| [HBase Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hbase-input-cp-main-page)                     | Big Data       | Read from an HBase column family.                                                                                                                                               |
| [HBase Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hbase-output)                                | Big Data       | Write to an HBase column family.                                                                                                                                                |
| [HBase row decoder](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hbase-row-decoder-pdi)                  | Big Data       | Decode an incoming key and HBase result object to a mapping.                                                                                                                    |
| [Hierarchical JSON input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hierarchical-json-input)          | Input          | Load JSON data into PDI from a previous step or from a file.                                                                                                                    |
| [Hierarchical JSON Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/hierarchical-json-output)        | Output         | Convert hierarchical data from a previous step into JSON format.                                                                                                                |
| [HL7 Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386798456/HL7+Input)                                                         | Input          | Read data from HL7 data streams.                                                                                                                                                |
| [HTTP client](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558141/HTTP+Client)                                                     | Lookup         | Call a web service over HTTP by supplying a base URL and allowing parameters to be set dynamically.                                                                             |
| [HTTP Post](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376441850/HTTP+Post)                                                         | Lookup         | Send a web service request over HTTP by supplying a base URL and allowing parameters to be set dynamically.                                                                     |
| [IBM WebSphere MQ Consumer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310162/IBM+Websphere+MQ+Consumer+Deprecated) | Deprecated     | Receive messages from any IBM WebSphere MQ Server.                                                                                                                              |
| [IBM WebSphere MQ Producer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310166/IBM+Websphere+MQ+Producer+Deprecated) | Deprecated     | Send messages to any IBM WebSphere MQ Server.                                                                                                                                   |
| [Identify last row in a stream](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310476/Identify+last+row+in+a+stream)                 | Flow           | Mark the last row.                                                                                                                                                              |
| [If field value is null](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442127/If+field+value+is+null)                               | Utility        | Set a field value to a constant if it is null.                                                                                                                                  |
| [Infobright Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800365/Infobright+Loader)                                         | Bulk loading   | Load data to an Infobright database table.                                                                                                                                      |
| [Ingres VectorWise Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/385450759/Ingres+VectorWise+Bulk+Loader)                 | Bulk loading   | Interface with the Ingres VectorWise Bulk Loader "COPY TABLE" command.                                                                                                          |
| [Injector](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558241/Injector)                                                           | Inline         | Inject rows into the transformation through the Java API.                                                                                                                       |
| [Insert / Update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558126/Insert+-+Update)                                             | Output         | Update or insert rows in a database based upon keys.                                                                                                                            |
| [Java Filter](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/java-filter-pdi-step)                         | Flow           | Filter rows using java code.                                                                                                                                                    |
| [JMS Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-consumer)                                | Streaming      | Receive messages from a JMS server.                                                                                                                                             |
| [JMS consumer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310137/JMS+Consumer+Deprecated)                           | Deprecated     | Replaced by [JMS Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-consumer).                                            |
| [JMS Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-producer)                                | Streaming      | Send messages to a JMS server.                                                                                                                                                  |
| [JMS producer (deprecated)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310145/JMS+Producer+Deprecated)                           | Deprecated     | Replaced by [JMS Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/jms-producer).                                            |
| [Job Executor](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/job-executor)                                | Flow           | Run a PDI job and pass parameters and rows.                                                                                                                                     |
| [Join Rows (cartesian product)](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558187/Join+Rows+Cartesian+product)                   | Joins          | Output the Cartesian product of the input streams. The number of output rows is the product of the number of rows in each input stream.                                         |
| [JSON Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/json-input)                                    | Input          | Extract relevant portions out of JSON structures (file or incoming field) and output rows.                                                                                      |
| [JSON output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388309851/JSON+output)                                                     | Output         | Create a JSON block and output it in a field or to a file.                                                                                                                      |
| [Kafka consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kafka-consumer)                            | Streaming      | Run a sub-transformation that executes according to message batch size or duration, letting you process a continuous stream of records in near-real-time.                       |
| [Kafka Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kafka-producer)                            | Streaming      | Publish messages in near-real-time across worker nodes where multiple, subscribed members have access.                                                                          |
| [Kinesis Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kinesis-consumer)                        | Streaming      | Extract data from a specific stream located within the Amazon Kinesis Data Streams service.                                                                                     |
| [Kinesis Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/kinesis-producer)                        | Streaming      | Push data to an existing region and stream located within the Amazon Kinesis Data Streams service.                                                                              |
| [Knowledge Flow](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311062/Knowledge+Flow)                                               | Data Mining    | Execute a Knowledge Flow data mining process.                                                                                                                                   |
| [LDAP Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558260/LDAP+Input)                                                       | Input          | Read data from an LDAP host.                                                                                                                                                    |
| [LDAP Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311403/LDAP+Output)                                                     | Output         | Perform insert, upsert, update, add or delete operations on records based on their DN (Distinguished Name).                                                                     |
| [LDIF Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081239/LDIF+Input)                                                       | Input          | Read data from LDIF files.                                                                                                                                                      |
| [Load file content in memory](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311567/Load+file+content+in+memory)                     | Input          | Load file content in memory.                                                                                                                                                    |
| LucidDB streaming loader (deprecated)                                                                                                                  | Deprecated     | Load data into LucidDB by using Remote Rows UDX.                                                                                                                                |

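The Group By row above notes that the step works only on sorted input. The following Python sketch illustrates why sorting matters for any aggregation that groups consecutive rows. It is an analogy built on the standard library's `itertools.groupby`, which also groups only consecutive items with equal keys. It is not PDI code, and the row data and the helper name `sum_by_consecutive_key` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input rows; note that the "EU" rows are not all adjacent.
rows = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 5},
    {"region": "EU", "amount": 7},
    {"region": "EU", "amount": 3},
]


def sum_by_consecutive_key(rows, key, value):
    """Sum `value` over each run of consecutive rows that share the same `key`."""
    return [
        (group_key, sum(row[value] for row in run))
        for group_key, run in groupby(rows, key=itemgetter(key))
    ]


# Unsorted input: the two separate runs of "EU" rows produce two separate groups.
unsorted_result = sum_by_consecutive_key(rows, "region", "amount")
print(unsorted_result)  # [('EU', 10), ('US', 5), ('EU', 10)]

# Sorted input: every key forms a single run, so each group is aggregated once.
sorted_rows = sorted(rows, key=itemgetter("region"))
sorted_result = sum_by_consecutive_key(sorted_rows, "region", "amount")
print(sorted_result)  # [('EU', 20), ('US', 5)]
```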
## Steps: M - R

| Name                                                                                                                                                                | Category           | Description                                                                                                                                                                                                                       |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Mail](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386793725/Mail+step)                                                                           | Utility            | Send e-mail.                                                                                                                                                                                                                      |
| [Mail Validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081634/Mail+Validator)                                                            | Validation         | Check if an email address is valid.                                                                                                                                                                                               |
| [Mapping](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mapping)                                                       | Mapping            | Run a mapping (sub-transformation). Use MappingInput and MappingOutput to specify the fields interface.                                                                                                                           |
| Mapping Input Specification                                                                                                                                         | Mapping            | Specify the input interface of a mapping.                                                                                                                                                                                         |
| Mapping Output Specification                                                                                                                                        | Mapping            | Specify the output interface of a mapping.                                                                                                                                                                                        |
| [MapReduce Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mapreduce-input)                                       | Big Data           | Enter Key Value pairs from Hadoop MapReduce.                                                                                                                                                                                      |
| [MapReduce Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mapreduce-output)                                     | Big Data           | Exit Key Value pairs, then push into Hadoop MapReduce.                                                                                                                                                                            |
| [MaxMind GeoIP Lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/364316813/Pentaho+Data+Integration+Steps)                                      | Lookup             | Look up an IPv4 address in a MaxMind database and add fields such as geography, ISP, or organization.                                                                                                                             |
| [Memory Group By](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/memory-group-by)                                       | Statistics         | Build aggregates in a group by fashion. This step doesn't require sorted input.                                                                                                                                                   |
| [Merge Join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558201/Merge+Join)                                                                    | Joins              | Join two streams on a given key and output a joined set. The input streams must be sorted on the join key.                                                                                                                        |
| [Merge Rows (diff)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/merge-rows-diff)                                     | Joins              | Merge two streams of rows, sorted on a certain key. The two streams are compared and identical, changed, deleted, and new rows are flagged.                                                                                       |
| [Metadata structure of stream](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386794485/Metadata+Structure+of+Stream)                                | Utility            | Read the metadata of the incoming stream.                                                                                                                                                                                         |
| [Microsoft Access Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558259/Access+Input)                                                      | Input              | Read data from a Microsoft Access file                                                                                                                                                                                            |
| [Microsoft Access Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558136/Access+Output)                                                    | Output             | Store records into an MS-Access database table.                                                                                                                                                                                   |
| [Microsoft Excel Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/microsoft-excel-input)                           | Input              | Read data from Excel and OpenOffice Workbooks (XLS, XLSX, ODS).                                                                                                                                                                   |
| Microsoft Excel Output (deprecated)                                                                                                                                 | Deprecated         | Replaced by [Microsoft Excel writer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/microsoft-excel-writer).                                                                          |
| [Microsoft Excel writer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/microsoft-excel-writer)                         | Output             | Write or appends data to an Excel file.                                                                                                                                                                                           |
| [Modified Java Script Value](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/modified-java-script-value)                 | Scripting          | Run JavaScript programs (and much more).                                                                                                                                                                                          |
| [Modify values from a single row](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/modify-values-from-a-single-row)       | Input              | Build complex hierarchical data.                                                                                                                                                                                                  |
| [Modify values from grouped rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/modify-values-from-grouped-rows)       | Input              | Modify hierarchical data to form nested JSON key-value pairs.                                                                                                                                                                     |
| [Mondrian Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mondrian-input-pdi-transformation-step-cp)              | Input              | Execute and retrieve data using an MDX query against a Pentaho Analysis OLAP server (Mondrian).                                                                                                                                   |
| [MonetDB Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386803510/MonetDB+bulk+loader)                                                  | Bulk loading       | Load data into MonetDB by using their bulk load command in streaming mode.                                                                                                                                                        |
| [MongoDB Execute](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mongodb-execute)                                       | Big Data           | Connect to a MongoDB cluster and execute Mongo shell-style commands.                                                                                                                                                              |
| [MongoDB Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mongodb-input)                                           | Big Data           | Read all entries from a MongoDB collection in the specified database.                                                                                                                                                             |
| [MongoDB Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mongodb-output)                                         | Big Data           | Write to a MongoDB collection.                                                                                                                                                                                                    |
| [MQTT Consumer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mqtt-consumer)                                           | Streaming          | Pull streaming data from an MQTT broker or clients through an MQTT transformation.                                                                                                                                                |
| [MQTT Producer](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/mqtt-producer)                                           | Streaming          | Publish messages in near-real-time to an MQTT broker.                                                                                                                                                                             |
| [Multiway Merge Join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388303826/Multiway+Merge+Join)                                                  | Joins              | Join multiple streams. This step supports INNER and FULL OUTER joins.                                                                                                                                                             |
| [MySQL Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/378110379/MySQL+Bulk+Loader)                                                      | Bulk loading       | Load data over a named pipe (not available on MS Windows).                                                                                                                                                                        |
| [Null if...](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558164/Null+If)                                                                       | Utility            | Set a field value to null if it is equal to a constant value.                                                                                                                                                                     |
| [Number range](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310492/Number+range)                                                                | Transform          | Create ranges based on numeric field.                                                                                                                                                                                             |
| [OLAP Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/384370419/OLAP+Input)                                                                    | Input              | Execute and retrieve data using an MDX query against any XML/A OLAP datasource using olap4j.                                                                                                                                      |
| OpenERP object delete (deprecated)                                                                                                                                  | Deprecated         | Delete data from the OpenERP server using the XMLRPC interface with the 'unlink' function.                                                                                                                                        |
| OpenERP object input (deprecated)                                                                                                                                   | Deprecated         | Retrieve data from the OpenERP server using the XMLRPC interface with the 'read' function.                                                                                                                                        |
| OpenERP object output (deprecated)                                                                                                                                  | Deprecated         | Update data on the OpenERP server using the XMLRPC interface and the 'import' function.                                                                                                                                           |
| [Oracle Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558251/Oracle+Bulk+Loader)                                                    | Bulk loading       | Use Oracle Bulk Loader to load data.                                                                                                                                                                                              |
| [ORC Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/orc-input)                                                   | Big Data           | Read field data from ORC files into a PDI data stream.                                                                                                                                                                            |
| [ORC Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/orc-output)                                                 | Big Data           | Serialize data from the PDI data stream into the ORC file format and write it to a file.                                                                                                                                          |
| [Output steps metrics](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311428/Output+Steps+Metrics)                                                | Statistics         | Return metrics for one or several steps.                                                                                                                                                                                          |
| Palo cell input (deprecated)                                                                                                                                        | Deprecated         | Retrieve all cell data from a Palo cube.                                                                                                                                                                                          |
| Palo cell output (deprecated)                                                                                                                                       | Deprecated         | Update cell data in a Palo cube.                                                                                                                                                                                                  |
| Palo dim input (deprecated)                                                                                                                                         | Deprecated         | Return elements from a dimension in a Palo database.                                                                                                                                                                              |
| Palo dim output (deprecated)                                                                                                                                        | Deprecated         | Create/update dimension elements and element consolidations in a Palo database.                                                                                                                                                   |
| [Parquet Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/parquet-input)                                           | Big Data           | Decode Parquet data formats and extract fields from the structure they define.                                                                                                                                                    |
| [Parquet Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/parquet-output)                                         | Big Data           | Map fields within data files and choose where you want to process those files.                                                                                                                                                    |
| [Pentaho Reporting Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/pentaho-reporting-output)                     | Output             | Execute an existing report file (.prpt).                                                                                                                                                                                          |
| [PostgreSQL Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372704366/PostgreSQL+Bulk+Loader)                                            | Bulk loading       | Bulk load PostgreSQL data.                                                                                                                                                                                                        |
| [Prioritize streams](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388309676/Prioritize+streams)                                                    | Flow               | Prioritize streams in an orderly way.                                                                                                                                                                                             |
| [Process files](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376442129/Process+files)                                                              | Utility            | Process one file per row (copy, move, or delete). This step only accepts filenames as input.                                                                                                                                      |
| [Properties Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310203/Properties+Output)                                                      | Output             | Write data to a properties file.                                                                                                                                                                                                  |
| [Property Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081241/Property+Input)                                                            | Input              | Read data (key, value) from properties files.                                                                                                                                                                                     |
| [Python Executor](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/python-executor)                                       | Scripting          | Map upstream data from a PDI input step or execute a Python script to generate data. When you send all rows, Python stores the dataset in a variable that kicks off your Python script.                                           |
| [Query metadata from a database](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/query-metadata-from-a-database-article) | Metadata Discovery | Retrieves metadata from a database connection.                                                                                                                                                                                    |
| [Query HCP](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/query-hcp)                                                   | Input              | Uses the Metadata Query Engine (MQE) to query your Hitachi Content Platform (HCP) repository for objects, their URLs, and system metadata properties.                                                                             |
| [R script executor](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311468/R+script+executor)                                                      | Statistics         | Execute an R script within a PDI transformation.                                                                                                                                                                                  |
| [Read metadata from HCP](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/read-metadata-from-hcp)                         | Input              | Identifies an HCP object by its URL path then specifies a target annotation name to read.                                                                                                                                         |
| [Read metadata from Copybook](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/read-metadata-from-copybook)               | Metadata Discovery | Reads a binary fixed-length copybook definition file and outputs the file and column descriptor information as fields to PDI rows.                                                                                                |
| Read metadata                                                                                                                                                       | Deprecated         | Search and retrieve metadata in the Pentaho Data Catalog that is associated with specific data resources that are registered in Data Catalog.                                                                                     |
| [Regex Evaluation](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/regex-evaluation)                                     | Scripting          | Evaluate regular expressions. This step uses a regular expression to evaluate a field. It can also extract new fields out of an existing field with capturing groups.                                                             |
| [Replace in String](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/replace-in-string)                                   | Transform          | Replace all occurrences of a word in a string with another word.                                                                                                                                                                  |
| [Reservoir Sampling](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310819/Reservoir+Sampling)                                                    | Statistics         | Sample a fixed number of rows from the incoming stream.                                                                                                                                                                           |
| [REST client step](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/rest-client-step)                                     | Lookup             | Consume RESTful services. Representational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which web services are viewed as resources and can be identified by their URLs.    |
| [Row Denormaliser](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/row-denormaliser)                                     | Transform          | Denormalise rows by looking up key-value pairs and by assigning them to new fields in the output rows. This method aggregates and needs the input rows to be sorted on the grouping fields.                                       |
| [Row Flattener](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/row-flattener)                                           | Transform          | Flatten consecutive rows based on the order in which they appear in the input stream.                                                                                                                                             |
| [Row Normaliser](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/row-normaliser)                                         | Transform          | Normalise de-normalised information.                                                                                                                                                                                              |
| [RSS Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376443633/RSS+Input)                                                                      | Input              | Read RSS feeds.                                                                                                                                                                                                                   |
| [RSS Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311425/RSS+Output)                                                                    | Output             | Read RSS stream.                                                                                                                                                                                                                  |
| [Rule Executor](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386794190/Rule+Executor)                                                              | Scripting          | Execute a rule against each row (using Drools).                                                                                                                                                                                   |
| [Rule Accumulator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386794192/Rule+Accumulator)                                                        | Scripting          | Execute a rule against a set of rows (using Drools).                                                                                                                                                                              |
| [Run SSH commands](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386795027/Run+SSH+commands)                                                        | Utility            | Run SSH commands and returns result.                                                                                                                                                                                              |
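
The Row Denormaliser entry above notes that the step aggregates key-value pairs into new fields and needs its input sorted on the grouping fields. The following Python sketch illustrates that idea outside PDI; it is not the step's implementation, and the function name `denormalise`, the `targets` mapping, and the sample field names are all hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def denormalise(rows, group_field, key_field, value_field, targets):
    """Pivot key-value rows into one output row per group.

    `targets` (an assumed structure) maps a key value to the name of the
    output field that should receive the matching value.
    """
    output = []
    # groupby only merges adjacent rows, which is why the input has to be
    # sorted on the grouping field before it reaches this function.
    for group_value, group_rows in groupby(rows, key=itemgetter(group_field)):
        out = {group_field: group_value}
        # Start every target field as None so missing keys stay empty.
        out.update({name: None for name in targets.values()})
        for row in group_rows:
            target = targets.get(row[key_field])
            if target is not None:
                out[target] = row[value_field]
        output.append(out)
    return output


# Hypothetical sample data, already sorted on "customer".
rows = [
    {"customer": 1, "key": "email", "value": "a@example.com"},
    {"customer": 1, "key": "phone", "value": "555-0100"},
    {"customer": 2, "key": "email", "value": "b@example.com"},
]
targets = {"email": "email_address", "phone": "phone_number"}
result = denormalise(rows, "customer", "key", "value", targets)
```

If the same rows arrive unsorted, a group that is split across non-adjacent rows produces more than one output row, which is the failure the sorting requirement guards against.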

## Steps: S - Z

| Name                                                                                                                                                                                                                                      | Category       | Description                                                                                                                                                              |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| [S3 CSV Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/s3-csv-input-cp)                                                                                                                | Input          | Read from an S3 CSV file.                                                                                                                                                |
| [S3 File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/s3-file-output-cp)                                                                                                            | Output         | Export data to a text file on Amazon Simple Storage Service (S3).                                                                                                        |
| [Salesforce bulk operation](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-bulk-operation)                                                                                         | Bulk loading   | Perform bulk operations on Salesforce objects.                                                                                                                           |
| [Salesforce Delete](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-delete)                                                                                                         | Output         | Delete records in a Salesforce module.                                                                                                                                   |
| [Salesforce Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-input)                                                                                                           | Input          | Read information from Salesforce.                                                                                                                                        |
| [Salesforce Insert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-insert)                                                                                                         | Output         | Insert records in a Salesforce module.                                                                                                                                   |
| [Salesforce Update](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-update)                                                                                                         | Output         | Update records in a Salesforce module.                                                                                                                                   |
| [Salesforce Upsert](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/salesforce-upsert)                                                                                                         | Output         | Insert or update records in a Salesforce module.                                                                                                                         |
| [Sample rows](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386800015/Sample+rows)                                                                                                                                        | Statistics     | Filter rows based on the line number.                                                                                                                                    |
| SAP input (deprecated)                                                                                                                                                                                                                    | Deprecated     | Read data from SAP ERP, optionally with parameters.                                                                                                                      |
| [SAS Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386795310/SAS+Input)                                                                                                                                            | Input          | Read files in sas7bdat (SAS) native format.                                                                                                                              |
| [Script](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311840/Script)                                                                                                                                                  | Experimental   | Calculate values by scripting in Ruby, Python, Groovy, JavaScript, and other scripting languages.                                                                        |
| [Select Values](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/select-values)                                                                                                                 | Transform      | Select or remove fields in a row. Optionally, set the field metadata: type, length and precision.                                                                        |
| [Send message to Syslog](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311275/Send+message+to+Syslog)                                                                                                                  | Utility        | Send a message to a Syslog server.                                                                                                                                       |
| [Serialize to file](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558131/Serialize+to+file)                                                                                                                            | Output         | Write rows of data to a data cube.                                                                                                                                       |
| [Set Field Value](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/set-field-value)                                                                                                             | Transform      | Replace the value of a field with the value of another field.                                                                                                            |
| [Set Field Value to a Constant](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/set-field-value-to-a-constant)                                                                                 | Transform      | Set the value of a field to a constant.                                                                                                                                  |
| [Set files in result](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558239/Set+files+in+result)                                                                                                                        | Job            | Set filenames in the result of this transformation. Subsequent job entries can then use this information.                                                                |
| [Set Session Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388317882/Set+Session+Variables)                                                                                                                    | Pentaho Server | Set the value of a session variable.                                                                                                                                     |
| [Set Variables](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558231/Set+Variables)                                                                                                                                    | Job            | Set environment variables based on a single input row.                                                                                                                   |
| [SFTP Put](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311842/SFTP+Put)                                                                                                                                              | Experimental   | Upload a file or a stream file to a remote host via SFTP.                                                                                                                |
| [Shared dimension](https://docs.pentaho.com/pdia-data-integration/extracting-data-into-pdi/work-with-the-streamlined-data-refinery/use-the-streamlined-data-refinery/building-blocks-for-the-sdr/using-the-shared-dimension-step-for-sdr) | Flow           | Refine your data for the Streamlined Data Refinery through the creation of dimensions which can be shared.                                                               |
| [Simple Mapping (sub-transformation)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/simple-mapping-sub-transformation)                                                                       | Mapping        | Turn a repetitive, re-usable part of a transformation (a sequence of steps) into a mapping (sub-transformation).                                                         |
| [Single Threader](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/single-threader)                                                                                                             | Flow           | Execute a sequence of steps in a single thread.                                                                                                                          |
| [Socket reader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558243/Socket+reader)                                                                                                                                    | Inline         | Read a socket. A socket client that connects to a server (Socket Writer step).                                                                                           |
| [Socket writer](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558244/Socket+writer)                                                                                                                                    | Inline         | Write a socket. A socket server that can send rows of data to a socket reader.                                                                                           |
| [Sort rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/sort-rows-transformation-step)                                                                                                     | Transform      | Sort rows based upon field values (ascending or descending).                                                                                                             |
| [Sorted Merge](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558198/Sorted+Merge)                                                                                                                                      | Joins          | Merge rows coming from multiple input steps provided these rows are themselves sorted on the given key fields.                                                           |
| [Split field to rows](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081873/Split+field+to+rows)                                                                                                                        | Transform      | Split a single string field by delimiter and create a new row for each split term.                                                                                       |
| [Split Fields](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/split-fields)                                                                                                                   | Transform      | Split a single field into more than one.                                                                                                                                 |
| [Splunk Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/splunk-input)                                                                                                                   | Transform      | Read data from Splunk.                                                                                                                                                   |
| [Splunk Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/splunk-output)                                                                                                                 | Transform      | Write data to Splunk.                                                                                                                                                    |
| [SQL File Output](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081683/SQL+File+Output)                                                                                                                                | Output         | Output SQL INSERT statements to a file.                                                                                                                                  |
| [Stream lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558139/Stream+Lookup)                                                                                                                                    | Lookup         | Look up values coming from another stream in the transformation.                                                                                                         |
| [String Operations](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/string-operations)                                                                                                         | Transform      | Apply operations such as trimming and padding to string values.                                                                                                          |
| [Strings cut](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/strings-cut)                                                                                                                     | Transform      | Cut out a snippet of a string.                                                                                                                                           |
| [Switch / Case](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/switch-case)                                                                                                                   | Flow           | Switch a row to a certain target step based on the case value in a field.                                                                                                |
| [Synchronize after merge](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376443876/Synchronize+after+merge)                                                                                                                | Output         | Perform insert/update/delete in one go based on the value of a field.                                                                                                    |
| [Table Compare](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388302925/Table+Compare)                                                                                                                                    | Utility        | Compare the data from two tables (provided they have the same layout). Find differences between the data in the two tables and log them.                                 |
| [Table exists](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372703380/Table+Exists)                                                                                                                                      | Lookup         | Check if a table exists on a specified connection.                                                                                                                       |
| [Table Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/table-input)                                                                                                                     | Input          | Read information from a database table.                                                                                                                                  |
| [Table Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/table-output)                                                                                                                   | Output         | Write information to a database table.                                                                                                                                   |
| [Teradata Fastload Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386798585/Teradata+Fastload+Bulk+Loader)                                                                                                    | Bulk loading   | Bulk load data into Teradata using FastLoad.                                                                                                                             |
| [Teradata TPT Insert Upsert Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388310916/Teradata+TPT+Insert+Upsert+Bulk+Loader)                                                                                  | Bulk loading   | Bulk load via TPT using the tbuild command.                                                                                                                              |
| [Text File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-input-cp)                                                                                                          | Input          | Read data from a text file in several formats. This data can then be passed to your next step(s).                                                                        |
| Text file input (deprecated)                                                                                                                                                                                                              | Deprecated     | Replaced by [Text File Input](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-input-cp).                            |
| [Text File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-output-cp)                                                                                                        | Output         | Write rows to a text file.                                                                                                                                               |
| Text file output (deprecated)                                                                                                                                                                                                             | Deprecated     | Replaced by [Text File Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/text-file-output-cp).                          |
| [Transformation Executor](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/transformation-executor)                                                                                             | Flow           | Run a PDI transformation, set parameters, and pass rows.                                                                                                                 |
| [Unique Rows](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/unique-rows)                                                                                                                     | Transform      | Remove duplicate rows and leave only unique occurrences. This works only on sorted input. If the input is not sorted, only consecutive duplicate rows are removed.       |
| [Unique Rows (HashSet)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/unique-rows-hashset)                                                                                                   | Transform      | Remove duplicate rows and leave only unique occurrences by using a HashSet.                                                                                              |
| [Univariate Statistics](https://pentaho-community.atlassian.net/wiki/spaces/DATAMINING/pages/276956306/Using+the+Univariate+Statistics+Plugin)                                                                                            | Statistics     | Compute simple statistics based on a single input field.                                                                                                                 |
| [Update](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558127/Update)                                                                                                                                                  | Output         | Update data in a database table based upon keys.                                                                                                                         |
| [User Defined Java Class](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/user-defined-java-class)                                                                                             | Scripting      | Program a step using Java code.                                                                                                                                          |
| [User Defined Java Expression](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/376439964/User+Defined+Java+Expression)                                                                                                      | Scripting      | Calculate the result of a Java Expression using Janino.                                                                                                                  |
| [Value Mapper](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558180/Value+Mapper)                                                                                                                                      | Transform      | Map values of a certain field from one value to another.                                                                                                                 |
| [Vertica Bulk Loader](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311122/Vertica+Bulk+Loader)                                                                                                                        | Bulk loading   | Bulk load data into a Vertica table using its high-performance COPY feature.                                                                                             |
| [Web services lookup](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372703383/Web+services+lookup)                                                                                                                        | Lookup         | Look up information using web services (WSDL).                                                                                                                           |
| [Write metadata to HCP objects](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/write-metadata-to-hcp)                                                                                         | Output         | Write custom metadata fields to a Hitachi Content Platform object.                                                                                                       |
| Write metadata                                                                                                                                                                                                                            | Deprecated     | Add new metadata to metadata in the Pentaho Data Catalog that is associated with specific data resources.                                                                |
| [Write to log](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/386806559/Write+to+log+step)                                                                                                                                 | Utility        | Write data to log.                                                                                                                                                       |
| [XBase input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/371558120/XBase+Input)                                                                                                                                        | Input          | Read records from an XBase type of database file (DBF).                                                                                                                  |
| [XML Input Stream (StAX)](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/xml-input-stream-stax)                                                                                               | Input          | Process very large and complex XML files very fast.                                                                                                                      |
| [XML Join](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/370967515/XML+Join)                                                                                                                                              | Joins          | Join a stream of XML-Tags into a target XML string.                                                                                                                      |
| [XML Output](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/xml-output-cp)                                                                                                                    | Output         | Write data to an XML file.                                                                                                                                               |
| [XSD Validator](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081877/XSD+Validator)                                                                                                                                    | Validation     | Validate XML source (files or streams) against an XML Schema Definition.                                                                                                 |
| [XSL Transformation](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/372081880/XSL+Transformation)                                                                                                                          | Transform      | Transform an XML stream using XSL (eXtensible Stylesheet Language).                                                                                                      |
| [Yaml Input](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388311572/Yaml+Input)                                                                                                                                          | Input          | Read a YAML source (file or stream), parse it, convert it to rows, and write these to one or more outputs.                                                               |
| [Zip File](https://pentaho-community.atlassian.net/wiki/spaces/EAI/pages/388302923)                                                                                                                                                       | Utility        | Create a standard ZIP archive from the data stream fields.                                                                                                               |

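The Unique Rows entry above notes that the step depends on sorted input, while Unique Rows (HashSet) does not. The following Python sketch illustrates that difference in general terms. It is not PDI code and does not reflect how either step is implemented; the function names `unique_consecutive` and `unique_hashset` and the sample rows are hypothetical.

```python
from itertools import groupby


def unique_consecutive(rows):
    """Drop a row only when it equals the row immediately before it."""
    return [key for key, _ in groupby(rows)]


def unique_hashset(rows):
    """Drop any row seen earlier, wherever it appears; keep first-seen order."""
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


unsorted_rows = ["b", "a", "b", "b", "a"]

print(unique_consecutive(unsorted_rows))          # ['b', 'a', 'b', 'a']
print(unique_consecutive(sorted(unsorted_rows)))  # ['a', 'b']
print(unique_hashset(unsorted_rows))              # ['b', 'a']
```

On unsorted input, the consecutive comparison leaves repeated values in place because equal rows are not adjacent. Sorting first (for example, with a Sort rows step) or tracking previously seen rows in a set avoids this.
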