# Cluster tab

![Cluster tab, Pentaho MapReduce](/files/d33pWuU1wCiWinGwSYaL)

The following table describes the options for configuring the Hadoop cluster connection:

| Option                      | Definition                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| --------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Hadoop job name**         | Enter the name of the Hadoop job you are running. It is required for the Pentaho MapReduce entry to work.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| **Hadoop Cluster**          | <p>Specify the configuration of your Hadoop cluster through the following options:</p><ul><li>Select an existing configuration from the list. If your configuration does not appear in the list, create it with the <strong>New</strong> button.</li><li>Click <strong>Edit</strong> to use the Hadoop cluster dialog box to modify an existing configuration. See <a href="/pages/pAzzVzwCU5XRZ9ANZ1SU">Connecting to a Hadoop cluster with the PDI client</a> for further details on this dialog box.</li><li>Click <strong>New</strong> to use the Hadoop cluster dialog box to create a new configuration. See <a href="/pages/pAzzVzwCU5XRZ9ANZ1SU">Connecting to a Hadoop cluster with the PDI client</a> for further details on this dialog box.</li></ul><p>See the <strong>Install Pentaho Data Integration and Analytics</strong> document for general information on Hadoop cluster configurations.</p> |
| **Number of Mapper Tasks**  | Enter the number of mapper tasks you want to assign to this job. The size of the inputs should determine the number of mapper tasks. Typically, there should be between 10 and 100 maps per node, though you can specify a higher number for mapper tasks that are not CPU-intensive. |
| **Number of Reducer Tasks** | Enter the number of reducer tasks you want to assign to this job. Lower numbers mean that the reduce operations can launch immediately and start transferring map outputs as the maps finish. The higher the number, the quicker the nodes will finish their first round of reduces and launch a second round. Increasing the number of reduce operations increases the Hadoop framework overhead, but improves load balancing. **Note:** If this is set to `0`, no reduce operation is performed, and the output of the mapper becomes the output of the entire job. Combiner operations are also not performed. |
| **Logging Interval**        | Enter the number of seconds between log messages.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| **Enable Blocking**         | Select to force the job to wait until each step completes before continuing to the next step. This is the only way for PDI to be aware of a Hadoop job's status. **Note:** If this option is not selected, the Hadoop job executes blindly, and PDI moves on to the next job entry. Error handling and routing do not work unless this option is selected. |
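
The sizing guidance in the table can be sketched as a small calculation. The following Python snippet is only an illustration of the 10 to 100 maps-per-node rule of thumb and of the map-only behavior when the reducer count is `0`. The function names `mapper_task_range` and `is_map_only` are hypothetical helpers, not part of PDI or Hadoop, and the node count is an assumed input that you would take from your own cluster.

```python
from typing import Tuple


def mapper_task_range(node_count: int,
                      low_per_node: int = 10,
                      high_per_node: int = 100) -> Tuple[int, int]:
    """Return the (low, high) mapper-task counts implied by the
    10 to 100 maps-per-node guidance. Hypothetical helper, not a PDI API."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_count * low_per_node, node_count * high_per_node


def is_map_only(reducer_tasks: int) -> bool:
    """Return True when the job has no reduce phase (reducer count of 0).
    In that case the mapper output is the output of the entire job."""
    if reducer_tasks < 0:
        raise ValueError("reducer_tasks cannot be negative")
    return reducer_tasks == 0


# Example: an assumed 4-node cluster.
low, high = mapper_task_range(4)
print(f"Suggested mapper tasks: {low} to {high}")  # Suggested mapper tasks: 40 to 400
print(is_map_only(0))  # True
```

Treat the resulting range as a starting point only: the size of the inputs still determines the appropriate number of mapper tasks, and mapper tasks that are not CPU-intensive can go above the upper bound.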

