To copy the files mentioned in these instructions, use either the Hadoop Copy Files job entry or the Hadoop command line tools.
Perform the following steps to modify the sample Spark job and understand how a Spark Submit entry works in PDI:
Copy a text file containing the words you want to count to HDFS on your cluster.
Start the PDI client.
Open the Spark Submit.kjb job, which can be found in the design-tools/data-integration/samples/jobs/Spark Submit folder.
Select File > Save As, and then rename and save the file as Spark Submit Sample.kjb.
The Spark Submit Sample.kjb file is saved to the jobs folder.
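If you use the Hadoop command line tools for the first step rather than the Hadoop Copy Files job entry, the copy might look like the following minimal sketch. The function name copy_to_hdfs, the HDFS_CMD override variable, and the example file and directory names are hypothetical and are not part of PDI or Hadoop. The sketch assumes the standard hdfs launcher is on your PATH and that your user can write to the target directory.

```shell
# Sketch: copy a local text file to HDFS with the Hadoop command line tools.
# copy_to_hdfs and HDFS_CMD are illustrative names, not part of PDI or Hadoop.
# HDFS_CMD defaults to the standard `hdfs` launcher; set it to `echo` to
# preview the commands without contacting the cluster.
copy_to_hdfs() {
  local_file="$1"   # local text file with the words to count
  hdfs_dir="$2"     # target directory in HDFS
  hdfs_cmd="${HDFS_CMD:-hdfs}"

  # Create the target directory; -p avoids an error if it already exists.
  "$hdfs_cmd" dfs -mkdir -p "$hdfs_dir" &&
  # Upload the file; -f overwrites an existing copy with the same name.
  "$hdfs_cmd" dfs -put -f "$local_file" "$hdfs_dir/" &&
  # List the directory to confirm that the file arrived.
  "$hdfs_cmd" dfs -ls "$hdfs_dir"
}
```

For example, copy_to_hdfs words.txt /user/your_name/input would upload words.txt, where both the file name and the HDFS path are placeholders for your own values.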