# Job Setup tab

![Pentaho MapReduce Job setup tab](/files/Ovs9DaREYaTme0r4NwIY)

The following table describes the options for setting up the inputs and outputs of the job:

| Option                            | Definition                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| --------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Input path**                    | Enter the path of the input directory on your Hadoop cluster where the source data for the MapReduce job is stored, such as `/wordcount/input`. To specify multiple input directories, use a comma-separated list. To read input from S3 storage, you must use the S3A connector with the `s3a://` protocol. The `s3` and `s3n` connectors are not supported. See [Hadoop documentation](https://cwiki.apache.org/confluence/display/HADOOP2/AmazonS3) for details. |
| **Output path**                   | Enter the path of the directory on your Hadoop cluster where you want the output from the MapReduce job to be stored, such as `/wordcount/output`. The output directory must not already exist when the MapReduce job runs. To specify S3 storage as the destination, you must use the S3A connector with the `s3a://` protocol. |
| **Remove output path before job** | Select to remove the specified output path before the MapReduce job is scheduled. **Note:** This option is not for use with S3. If you need to clean the output path for S3 destinations, use an alternative entry, such as [Delete folders](https://pentaho-public.atlassian.net/wiki/spaces/EAI/pages/372703488/Delete+folders), to clear the output folder. |
| **Input format**                  | Enter the Apache Hadoop class name that describes the input specification for the MapReduce job. See [InputFormat](http://hadoop.apache.org/docs/stable/api/index.html?org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html) for more information.                                                                                                                                                                                                                    |
| **Output format**                 | Enter the Apache Hadoop class name that describes the output specification for the MapReduce job. See [OutputFormat](http://hadoop.apache.org/docs/stable/api/index.html?org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html) for more information. |
| **Ignore output of map key**      | Select to ignore the key output from the mapper transformation and replace it with `NullWritable`.                                                                                                                                                                                                                                                                                                                                                                          |
| **Ignore output of map value**    | Select to ignore the value output from the mapper transformation and replace it with `NullWritable`.                                                                                                                                                                                                                                                                                                                                                                        |
| **Ignore output of reduce key**   | Select to ignore the key output from the combiner and/or reducer transformations and replace them with `NullWritable`. This requires a reducer transformation to be used, not the Identity Reducer.                                                                                                                                                                                                                                                                         |
| **Ignore output of reduce value** | Select to ignore the value output from the combiner and/or reducer transformations and replace them with `NullWritable`. This requires a reducer transformation to be used, not the Identity Reducer. |
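
The path rules in the table can be checked before a job is configured. The following is a minimal sketch, not part of Pentaho: the function names `split_input_paths`, `check_path`, and `check_job_setup` are hypothetical helpers, and the bucket names in the usage note are made up. It only inspects the path strings (comma-separated input paths, `s3a://` instead of `s3://` or `s3n://`, and the note about **Remove output path before job** with S3); it does not contact a cluster.

```python
from urllib.parse import urlparse

# Connectors the table lists as unsupported; s3a:// is the supported one.
UNSUPPORTED_SCHEMES = {"s3", "s3n"}


def split_input_paths(raw):
    """Split a comma-separated Input path value into individual paths."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def check_path(path):
    """Return a list of problems for one path (empty if none were found)."""
    scheme = urlparse(path).scheme.lower()
    if scheme in UNSUPPORTED_SCHEMES:
        return [f"{path}: use s3a:// instead of {scheme}://"]
    return []


def check_job_setup(input_path, output_path, remove_output_before_job=False):
    """Hypothetical pre-flight check of the Job Setup path options."""
    problems = []
    for p in split_input_paths(input_path):
        problems.extend(check_path(p))
    problems.extend(check_path(output_path))
    # The table notes that the remove option is not for use with S3.
    if remove_output_before_job and urlparse(output_path).scheme.lower() == "s3a":
        problems.append(
            "Remove output path before job is not for use with S3; "
            "use an entry such as Delete folders instead"
        )
    return problems
```

For example, `check_job_setup("s3n://example-bucket/in,/wordcount/input", "s3a://example-bucket/out", True)` reports one problem for the `s3n://` input and one for combining the remove option with an S3 destination, while plain HDFS paths such as `/wordcount/input` and `/wordcount/output` produce no problems.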

