# Steps using Dataset tuning options

As part of Spark tuning, you can use the Dataset tuning options with the following steps.

| Step category  | Step name                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Agile          | <ul><li>MonetDB Agile Mart</li><li>Table Agile Mart</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| Big Data       | <ul><li>Avro output</li><li>Cassandra Input</li><li>Cassandra Output</li><li>CouchDB Input</li><li>Hadoop file output</li><li>HBase output</li><li>HBase row decoder</li><li>MapReduce Input</li><li>MapReduce Output</li><li>MongoDB Input</li><li>MongoDB Output</li><li>Orc output</li><li>Parquet output</li><li>SSTable Output</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| Bulk loading   | <ul><li>ElasticSearch Bulk Insert</li><li>Greenplum Load</li><li>Infobright Loader</li><li>Ingres VectorWise Bulk Loader</li><li>MonetDB Bulk Loader</li><li>MySQL Bulk Loader</li><li>Oracle Bulk Loader</li><li>PostgresSQL Bulk Loader</li><li>SAP HANA Bulk Loader</li><li>Teradata Fastload Bulk Loader</li><li>Teradata TPT Insert Upsert Bulk Loader</li><li>Vertica Bulk Loader</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Cryptography   | <ul><li>Decrypt files with PGP</li><li>Encrypt Files with PGP</li><li>Secret Key Generator</li><li>Symmetric Cryptography</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| Data Mining    | <ul><li>AARF Output</li><li>Knowledge Flow</li><li>Weka Forecasting</li><li>Weka Scoring</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| Data Warehouse | <ul><li>Combination lookup/update</li><li>Dimension lookup/update</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| Deprecated     | <ul><li>Aggregate Rows</li><li>Example plugin</li><li>Get previous row fields</li><li>Greenplum Bulk loader</li><li>IBM WebSphere MQ Consumer</li><li>IBM WebSphere MQ Producer</li><li>JMS Consumer (deprecated)</li><li>JMS Producer (deprecated)</li><li>LucidDB Bulk Loader</li><li>LucidDB Streaming Loader</li><li>OpenERP Object Delete</li><li>OpenERP Object Input</li><li>OpenERP Object Output</li><li>Palo Cell Input</li><li>Palo Cell Output</li><li>Palo Dimension Input</li><li>SAP Input</li><li>Text file output (deprecated)</li></ul>                                                                                                                                                                                                                                                                                                                                                                                   |
| Experimental   | <ul><li>Script</li><li>SFTP Put</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| Flow           | <ul><li>Abort</li><li>Annotate stream</li><li>Blocking step</li><li>Detect empty stream</li><li>Dummy</li><li>ETL metadata injection</li><li>Filter rows</li><li>Identify last row in a stream</li><li>Java filter</li><li>Shared Dimension</li><li>Switch / Case</li><li>Transformation executor</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| Inline         | <ul><li>Injector</li><li>Socket reader</li><li>Socket writer</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| Input          | <ul><li>CSV file input</li><li>Data Grid</li><li>De-serialize from file</li><li>Email messages input</li><li>ESRI Shapefile Reader</li><li>Fixed file input</li><li>Generate random credit card numbers</li><li>Generate random value</li><li>Generate rows</li><li>Get data from XML</li><li>Get File Names</li><li>Get File Rows Count</li><li>Get repository names</li><li>Get SubFolder names</li><li>Get System Info</li><li>Get table names</li><li>Google Analytics</li><li>Google Docs Input</li><li>GZIP CSV Input</li><li>HL7 Input</li><li>JMS Consumer</li><li>JSON Input</li><li>LDAP Input</li><li>LDIF Input</li><li>Load file content in memory</li><li>Microsoft Access Input</li><li>Microsoft Excel Input</li><li>Mondrian Input</li><li>OLAP Input</li><li>Property Input</li><li>RSS Input</li><li>Salesforce Input</li><li>SAS Input</li><li>XBase Input</li><li>XML Input Stream (StAX)</li><li>Yaml Input</li></ul> |
| Job            | <ul><li>Copy rows to result</li><li>Get files from result</li><li>Get rows from result</li><li>Get Session Variables</li><li>Set files in result</li><li>Set Session Variables</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| Joins          | <ul><li>Join rows</li><li>Merge join</li><li>Merge rows (diff)</li><li>Multiway Merge Join</li><li>Sorted Merge</li><li>XML Join</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| Lookup         | <ul><li>Call DB Procedure</li><li>Check if a column exists</li><li>Check if file is locked</li><li>Check if webservice is available</li><li>Database join</li><li>Database lookup</li><li>Dynamic SQL row</li><li>File exists</li><li>Fuzzy match</li><li>HTTP client</li><li>HTTP Post</li><li>MaxMind GeoIP Lookup</li><li>REST Client</li><li>Stream lookup</li><li>Table exists</li><li>Web services lookup</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| Mapping        | <ul><li>Mapping</li><li>Mapping input specification</li><li>Mapping output specification</li><li>Simple mapping</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| N/A            | <ul><li>Spark Special - FileInputResolver</li><li>Spark Special - GenericSparkOperation</li><li>Spark Special - RecordsFromStreamSparkOperation</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| Output         | <ul><li>Automatic Documentation Output</li><li>Delete</li><li>Insert / Update</li><li>JSON output</li><li>LDAP Output</li><li>Microsoft Access Output</li><li>Microsoft Excel Output</li><li>Microsoft Excel Writer</li><li>Pentaho Reporting Output</li><li>Properties Output</li><li>RSS Output</li><li>Salesforce Delete</li><li>Salesforce Insert</li><li>Salesforce Update</li><li>Salesforce Upsert</li><li>Serialize to file</li><li>SQL File Output</li><li>Synchronize after merge</li><li>Table output</li><li>Text file output</li><li>Update</li><li>XML Output</li></ul>                                                                                                                                                                                                                                                                                                                                                       |
| Pentaho Server | <ul><li>Call Endpoint</li><li>Get Session Variables</li><li>Set Session Variables</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| Scripting      | <ul><li>Execute row SQL script</li><li>Execute SQL script</li><li>Formula</li><li>Modified Java Script Value</li><li>Python Executor</li><li>Regex Evaluation</li><li>Rule Accumulator</li><li>Rule Executor</li><li>User Defined Java Class</li><li>User Defined Java Expression</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| Statistics     | <ul><li>Analytic Query</li><li>Group by</li><li>Memory group by</li><li>Output steps metrics</li><li>R script executor</li><li>Reservoir Sampling</li><li>Sample rows</li><li>Univariate Statistics</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| Streaming      | <ul><li>AMQP Producer</li><li>JMS Producer</li><li>Kafka Producer</li><li>Get records from stream</li><li>MQTT producer</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Transform      | <ul><li>Add a Checksum</li><li>Add constants</li><li>Add sequence</li><li>Add value fields changing sequence</li><li>Add XML</li><li>Calculator</li><li>Closure Generator</li><li>Concat Fields</li><li>Get ID from slave server</li><li>Number range</li><li>Replace in string</li><li>Row denormaliser</li><li>Row flattener</li><li>Row Normaliser</li><li>Select values</li><li>Set field value</li><li>Set field value to a constant</li><li>Sort rows</li><li>Split field to rows</li><li>Split Fields</li><li>Splunk Input</li><li>Splunk Output</li><li>String operations</li><li>Strings cut</li><li>Unique rows</li><li>Unique rows (Hashset)</li><li>Value Mapper</li><li>XSL Transformation</li></ul>                                                                                                                                                                                                                           |
| Utility        | <ul><li>Change file encoding</li><li>Clone row</li><li>Delay row</li><li>Edi to XML</li><li>Execute a process</li><li>If field value is null</li><li>Mail</li><li>Metadata structure of stream</li><li>Null if...</li><li>Process files</li><li>Run SSH commands</li><li>Send messge to Syslog</li><li>Table Compare</li><li>Write to log</li><li>Zip File</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| Validation     | <ul><li>Credit card validator</li><li>Data Validator</li><li>Mail Validator</li><li>XSD Validator</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |

