# Hadoop to PDI data type conversion

The Hadoop Job Executor and Pentaho MapReduce steps have an advanced configuration mode in which you specify data types for the job's input and output. PDI cannot detect foreign data types on its own, so you must specify the input and output data types in the **Job Setup** tab.

This table explains the relationship between Hadoop data types and their PDI equivalents.

| PDI (Kettle) Data Type               | Apache Hadoop Data Type             |
| ------------------------------------ | ----------------------------------- |
| `java.lang.Integer`                  | `org.apache.hadoop.io.IntWritable`  |
| `java.lang.Long`                     | `org.apache.hadoop.io.IntWritable`  |
| `java.lang.Long`                     | `org.apache.hadoop.io.LongWritable` |
| `org.apache.hadoop.io.IntWritable`   | `java.lang.Long`                    |
| `java.lang.String`                   | `org.apache.hadoop.io.Text`         |
| `java.lang.String`                   | `org.apache.hadoop.io.IntWritable`  |
| `org.apache.hadoop.io.LongWritable`  | `org.apache.hadoop.io.Text`         |
| `org.apache.hadoop.io.LongWritable`  | `java.lang.Long`                    |
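
When you script or validate job configurations, it can help to have the rows of the table above available as a simple lookup. The following is a minimal sketch of that idea in plain Java. The class `TypePairTable` and its methods are hypothetical names introduced for illustration; they are not part of PDI or Hadoop, and the sketch only mirrors the table rows without performing any actual conversion.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a plain lookup over the rows of the table above.
// TypePairTable and its methods are hypothetical, not a PDI or Hadoop API.
final class TypePairTable {
    // Each entry is {first column, second column}, copied from the table.
    private static final String[][] ROWS = {
        {"java.lang.Integer", "org.apache.hadoop.io.IntWritable"},
        {"java.lang.Long", "org.apache.hadoop.io.IntWritable"},
        {"java.lang.Long", "org.apache.hadoop.io.LongWritable"},
        {"org.apache.hadoop.io.IntWritable", "java.lang.Long"},
        {"java.lang.String", "org.apache.hadoop.io.Text"},
        {"java.lang.String", "org.apache.hadoop.io.IntWritable"},
        {"org.apache.hadoop.io.LongWritable", "org.apache.hadoop.io.Text"},
        {"org.apache.hadoop.io.LongWritable", "java.lang.Long"},
    };

    // Returns every second-column type paired with the given first-column type.
    static List<String> pairedWith(String firstColumnType) {
        List<String> result = new ArrayList<>();
        for (String[] row : ROWS) {
            if (row[0].equals(firstColumnType)) {
                result.add(row[1]);
            }
        }
        return result;
    }

    // True when the exact pair appears as a row in the table.
    static boolean isListed(String first, String second) {
        return pairedWith(first).contains(second);
    }

    public static void main(String[] args) {
        // Print the types listed alongside java.lang.String.
        System.out.println(pairedWith("java.lang.String"));
        // Check a pair that does not appear in the table.
        System.out.println(isListed("java.lang.Integer", "org.apache.hadoop.io.LongWritable"));
    }
}
```

A lookup like this could be used to flag a type pair that is not listed before you enter it in the **Job Setup** tab; whether an unlisted pair can still be configured is covered by the Pentaho MapReduce documentation, not by this sketch.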

For more information on configuring Pentaho MapReduce to convert to additional data types, see [Pentaho MapReduce](/pdia-data-integration/10.2-data-integration/pdi-job-entries-reference-overview/pentaho-mapreduce.md).

