# PDI Hadoop job workflow

PDI enables you to execute a Java class from within a PDI client job to perform operations on Hadoop data. The approach is similar to the one you would take for any other PDI job. The job entry designed specifically to handle the Java class is Hadoop Job Executor. In this illustration, it is used in the WordCount - Advanced entry.

![Hadoop Job Executor Workflow](/files/cnsx9Czi6EXaJsKNua79)

The Hadoop Job Executor dialog box enables you to configure the entry with a `.jar` file that contains the Java class.

![Hadoop Job Executor dialog](/files/0HDVasgVCFZUTr2ANlQR)
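
To make the role of the Java class more concrete, the following is a minimal sketch of the word-count logic that a class packaged in such a `.jar` file might perform. It is illustrative only: the class name `WordCountSketch` and its methods are hypothetical, and it uses only the JDK so that it runs on its own. An actual Hadoop job class would be written against the Hadoop MapReduce API rather than plain collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the Java class packaged in the .jar file.
// Assumption: a real job would use Hadoop's MapReduce API instead of
// the plain JDK collections used here.
class WordCountSketch {

    // "Map" phase: split each input line into lowercase words.
    static List<String> mapPhase(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    words.add(token);
                }
            }
        }
        return words;
    }

    // "Reduce" phase: sum the occurrences of each word.
    // A TreeMap keeps the words in sorted order.
    static Map<String, Integer> reducePhase(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Runs both phases in sequence.
    static Map<String, Integer> countWords(List<String> lines) {
        return reducePhase(mapPhase(lines));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the quick brown fox", "the lazy dog");
        System.out.println(countWords(lines));
    }
}
```

The two static methods mirror the map and reduce phases of a word-count job, which is why they are kept separate rather than merged into a single loop.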

If you are using the Amazon Elastic MapReduce (EMR) service, you can use the Amazon EMR Job Executor job entry to execute the Java class. This differs from the standard Hadoop Job Executor in that it contains connection information for Amazon S3 and configuration options for EMR.

![Amazon EMR Job Executor job entry](/files/2lrvfdQgQe5dKJoib34a)


