Job Setup tab

The following table describes the options for setting up the inputs and outputs of the job:
Option
Definition
Input path
Enter the path of the input directory on your Hadoop cluster, such as /wordcount/input, where the source data for the MapReduce job is stored. To specify multiple input directories, use a comma-separated list.
Output path
Enter the path of the directory on your Hadoop cluster, such as /wordcount/output, where you want the output from the MapReduce job to be stored.
Note: The output directory must not already exist when the MapReduce job runs.
Remove output path before job
Select to remove the specified output path before the MapReduce job is scheduled.
Input format
Enter the Apache Hadoop class name that describes the input specification for the MapReduce job. See InputFormat for more information.
Output format
Enter the Apache Hadoop class name that describes the output specification for the MapReduce job. See OutputFormat for more information.
Ignore output of map key
Select to ignore the key output from the mapper transformation and replace it with NullWritable.
Ignore output of map value
Select to ignore the value output from the mapper transformation and replace it with NullWritable.
Ignore output of reduce key
Select to ignore the key output from the combiner and/or reducer transformations and replace it with NullWritable. This option requires a reducer transformation; it has no effect with the Identity Reducer.
Ignore output of reduce value
Select to ignore the value output from the combiner and/or reducer transformations and replace it with NullWritable. This option requires a reducer transformation; it has no effect with the Identity Reducer.
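
The options above can be pictured with a small model. The following is a minimal sketch, not Pentaho or Hadoop code: the JobSetup class, its field names, and the three helper functions are hypothetical and exist only for illustration. It assumes a text-style output format that writes key and value separated by a tab and simply omits a field that has been replaced with NullWritable, and it models the cluster as a plain set of directory names. The /wordcount/extra path is likewise made up.

```python
from dataclasses import dataclass

# All names here are hypothetical; none of them belong to Pentaho or Hadoop.


@dataclass
class JobSetup:
    input_path: str                    # "Input path": comma-separated directories
    output_path: str                   # "Output path"
    remove_output_first: bool = False  # "Remove output path before job"
    ignore_key: bool = False           # one of the "Ignore output of ... key" options
    ignore_value: bool = False         # one of the "Ignore output of ... value" options


def input_directories(setup):
    """Split the comma-separated input path into individual directories."""
    return [p.strip() for p in setup.input_path.split(",") if p.strip()]


def prepare_output(setup, existing_dirs):
    """Apply the output path rule to a set of directories that already exist.

    Returns the directories left after preparation. Raises FileExistsError
    when the output directory exists and removal was not selected.
    """
    remaining = set(existing_dirs)
    if setup.output_path in remaining:
        if not setup.remove_output_first:
            raise FileExistsError(
                f"output directory already exists: {setup.output_path}"
            )
        remaining.discard(setup.output_path)
    return remaining


def render_record(setup, key, value):
    """Format one record as tab-separated text, leaving out ignored fields."""
    parts = []
    if not setup.ignore_key:
        parts.append(str(key))
    if not setup.ignore_value:
        parts.append(str(value))
    return "\t".join(parts)


setup = JobSetup(
    input_path="/wordcount/input, /wordcount/extra",
    output_path="/wordcount/output",
    remove_output_first=True,
    ignore_key=True,
)
print(input_directories(setup))             # ['/wordcount/input', '/wordcount/extra']
print(render_record(setup, "hadoop", 3))    # 3
```

With ignore_key left unselected, the same record would render as the key, a tab, and the value. With remove_output_first left unselected, prepare_output raises an error whenever the output directory is already present, which mirrors the note on the output path.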