Method 1: Load the data to HDFS before running the transform

  1. Run a different transformation using the Pentaho engine to move the data to the HDFS cluster.

  2. Then use HDFS Input to run the transformation using the Spark engine.

Last updated