Hadoop to PDI data type conversion
The Hadoop Job Executor and Pentaho MapReduce steps have an advanced configuration mode that lets you specify data types for the job's input and output. PDI cannot detect foreign data types on its own, so you must set the input and output data types in the Job Setup tab.
The following table shows how PDI (Kettle) data types map to their Apache Hadoop equivalents.
| PDI (Kettle) data type           | Apache Hadoop data type           |
|----------------------------------|-----------------------------------|
| java.lang.Integer                | org.apache.hadoop.io.IntWritable  |
| java.lang.Long                   | org.apache.hadoop.io.IntWritable  |
| java.lang.Long                   | org.apache.hadoop.io.LongWritable |
| org.apache.hadoop.io.IntWritable | java.lang.Long                    |
| java.lang.String                 | org.apache.hadoop.io.Text         |
| java.lang.String                 | org.apache.hadoop.io.IntWritable  |
| org.apache.hadoop.io.LongWritable | org.apache.hadoop.io.Text        |
| org.apache.hadoop.io.LongWritable | java.lang.Long                   |
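To make the mappings above concrete, here is a minimal sketch of the kind of wrapping and unwrapping that occurs when values cross the PDI/Hadoop boundary. It uses only the standard org.apache.hadoop.io classes; the class name WritableConversionDemo is hypothetical, and this is illustrative code, not PDI's internal conversion logic.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Illustrative sketch only (hypothetical class, not part of PDI):
// shows the Java <-> Writable pairings listed in the table above.
public class WritableConversionDemo {
    public static void main(String[] args) {
        // java.lang.Integer -> org.apache.hadoop.io.IntWritable
        Integer javaInt = 42;
        IntWritable intWritable = new IntWritable(javaInt);

        // java.lang.Long -> org.apache.hadoop.io.LongWritable
        Long javaLong = 42L;
        LongWritable longWritable = new LongWritable(javaLong);

        // java.lang.String -> org.apache.hadoop.io.Text
        String javaString = "forty-two";
        Text text = new Text(javaString);

        // org.apache.hadoop.io.IntWritable -> java.lang.Long
        // (per the table, IntWritable comes back as java.lang.Long)
        Long fromIntWritable = (long) intWritable.get();

        // org.apache.hadoop.io.LongWritable -> java.lang.Long
        Long fromLongWritable = longWritable.get();

        // org.apache.hadoop.io.Text -> java.lang.String
        String fromText = text.toString();

        System.out.printf("%s %s %s -> %d %d %s%n",
                intWritable, longWritable, text,
                fromIntWritable, fromLongWritable, fromText);
    }
}
```

In a Pentaho MapReduce job these conversions happen automatically once the types are declared in the Job Setup tab; the snippet only illustrates why the declaration is required, since a raw Writable carries no hint about which Java type PDI should produce.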
For more information on configuring Pentaho MapReduce to convert to additional data types, see Pentaho MapReduce.