Use the Spark Submit job entry with an external Spark script to run Spark jobs on YARN clusters.
This example shows how to submit a Spark job from PDI.
If you use Spark Submit with Kerberos-secured Cloudera CDP, see "Use Kerberos with Spark Submit" in the "Administer Pentaho Data Integration and Analytics" documentation.
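For orientation, it can help to see the kind of command line that a spark-submit call on YARN typically amounts to. The following is a minimal sketch, not the job entry's actual implementation: the function name build_spark_submit_command, the application path, the class name, and the arguments are all hypothetical. Only the spark-submit options shown (--master, --deploy-mode, --class) are standard Spark options.

```python
import shlex
from typing import List, Optional


def build_spark_submit_command(
    app_path: str,
    main_class: Optional[str] = None,
    deploy_mode: str = "cluster",
    app_args: Optional[List[str]] = None,
    spark_submit: str = "spark-submit",
) -> List[str]:
    """Assemble a spark-submit argument list that targets YARN.

    Hypothetical helper for illustration only; PDI builds its own
    command from the fields of the Spark Submit job entry.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unsupported deploy mode: {deploy_mode!r}")

    # --master yarn tells Spark to request resources from YARN.
    command = [spark_submit, "--master", "yarn", "--deploy-mode", deploy_mode]

    # --class is only needed for JVM applications (JAR files).
    if main_class:
        command += ["--class", main_class]

    # The application file comes next, followed by its own arguments.
    command.append(app_path)
    command += list(app_args or [])
    return command


if __name__ == "__main__":
    # All values below are placeholders, not paths from a real installation.
    cmd = build_spark_submit_command(
        "/path/to/app.jar",
        main_class="com.example.WordCount",
        app_args=["/input", "/output"],
    )
    print(shlex.join(cmd))
```

The sketch only builds and prints the command, so it is safe to run on a machine without Spark installed.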
Before you begin
Install and configure the Spark client. Follow the Spark Submit job entry instructions in the Pentaho Data Integration documentation.
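After installation, a quick sanity check of the client environment can catch common setup gaps. The sketch below is an assumption-laden illustration, not part of the Pentaho instructions: it assumes a Unix-style Spark layout with the script at SPARK_HOME/bin/spark-submit, and it relies on the Spark convention that HADOOP_CONF_DIR or YARN_CONF_DIR points to the cluster's client-side configuration. The function name check_spark_client is hypothetical.

```python
import os
from typing import Dict, List, Optional


def check_spark_client(env: Optional[Dict[str, str]] = None) -> List[str]:
    """Return a list of problems found in the Spark client environment.

    An empty list means the basic checks passed. This does not verify
    that the cluster is reachable or that the configuration is correct.
    """
    env = dict(os.environ) if env is None else env
    problems: List[str] = []

    # Assumption: the client is located through SPARK_HOME.
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    else:
        script = os.path.join(spark_home, "bin", "spark-submit")
        if not os.path.isfile(script):
            problems.append(f"spark-submit not found at {script}")

    # Spark on YARN reads cluster settings from one of these directories.
    if not (env.get("HADOOP_CONF_DIR") or env.get("YARN_CONF_DIR")):
        problems.append("neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set")

    return problems


if __name__ == "__main__":
    for problem in check_spark_client():
        print("WARNING:", problem)
```

Passing the environment as a dictionary keeps the check easy to test; calling it with no arguments inspects the current process environment.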