Method 1: Load the data to HDFS before running the transform
First, run a separate transformation on the Pentaho engine to move the data to the HDFS cluster.
Then run the original transformation on the Spark engine, using HDFS Input to read the data from the cluster.
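The staging step itself is done with a Pentaho transformation, as described above. For readers who want to see what that step amounts to, the following is a minimal sketch that does the equivalent copy from a script by calling the standard Hadoop client command `hdfs dfs -put`. It is not the documented Pentaho method. It assumes the `hdfs` client is installed and configured on the machine running the script, and the function names and file paths are hypothetical.

```python
import shlex
import subprocess


def build_hdfs_put_command(local_path: str, hdfs_dir: str) -> list[str]:
    """Return the argument list for `hdfs dfs -put -f <local_path> <hdfs_dir>`."""
    # -f overwrites the destination file if it already exists.
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir]


def stage_file(local_path: str, hdfs_dir: str) -> None:
    """Copy a local file to HDFS. Assumes the `hdfs` client is on PATH."""
    # check=True raises CalledProcessError if the copy fails.
    subprocess.run(build_hdfs_put_command(local_path, hdfs_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_hdfs_put_command("/tmp/sales.csv", "/user/etl/input/")
    print(shlex.join(cmd))
```

Once the file is on the cluster, the Spark-engine transformation can read it from the HDFS location instead of the local path.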