Method 1: Load the data to HDFS before running the transform

  1. Run a different transformation using the Pentaho engine to move the data to the HDFS cluster.

  2. Then use HDFS Input to run the transformation using the Spark engine.

Last updated

Was this helpful?