Before you begin

Before you set up Pentaho to connect to Cloudera Data Platform (CDP), complete the following tasks:

  1. Check the Components Reference to verify that your Pentaho version supports your version of CDP.

  2. Prepare your CDP by performing the following tasks:

    1. Configure your Cloudera Data Platform.

      See CDP's documentation if you need help.

    2. Install any required services and service client tools.

    3. Test the platform.

  3. Contact your platform administrator for the connection information for CDP and the services that you intend to use. Some of this information may be available from Cloudera Manager or other management tools. You will also need to pass some of this information on to users after you finish the setup.

  4. Add the YARN user on the platform to the group defined by the dfs.permissions.superusergroup property. You can find this property in the hdfs-site.xml file on your platform or in Cloudera Manager.

  5. Set up Pentaho to connect to a Hadoop cluster. You need to install the driver for your version of CDP.
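
To identify the group that the YARN user must join, you can read dfs.permissions.superusergroup directly from hdfs-site.xml. The following Python sketch is one possible way to do that, assuming the standard Hadoop configuration layout of property elements with name and value children. The helper name read_hadoop_property and the sample group hdfs_admins are hypothetical. The fallback value supergroup is the default listed in Hadoop's hdfs-default.xml; confirm the effective value in Cloudera Manager for your platform.

```python
import xml.etree.ElementTree as ET

# Default from Hadoop's hdfs-default.xml, used when the property is not overridden.
DEFAULT_SUPERUSER_GROUP = "supergroup"


def read_hadoop_property(xml_text, name, default=None):
    """Return the value of a named property from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


# Illustrative content only (hypothetical group name); a real hdfs-site.xml
# holds many more properties.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hdfs_admins</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    group = read_hadoop_property(
        SAMPLE_HDFS_SITE,
        "dfs.permissions.superusergroup",
        DEFAULT_SUPERUSER_GROUP,
    )
    print(group)
```

To use the sketch against a real cluster, read the contents of your platform's hdfs-site.xml and pass them in place of SAMPLE_HDFS_SITE.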

Set up a secured instance of CDP

If you are connecting to a CDP instance that is secured with Kerberos, also perform the following tasks:

  1. Configure Kerberos security on the platform, including the Kerberos Realm, Kerberos KDC, and Kerberos Administrative Server.

  2. Configure the following nodes to accept remote connection requests:

    • Name node

    • Data nodes

    • Secondary name node

    • Job tracker node

    • Task tracker nodes

  3. If you have deployed CDP using an enterprise-level program, set up Kerberos for name, data, secondary name, job tracker, and task tracker nodes.

  4. Add user account credentials to the Kerberos database for each Pentaho user that needs access to CDP.

  5. Verify that an operating system user account exists on each node in CDP for each user you want to add to the Kerberos database. Add operating system user accounts if necessary.

    Note: User account UIDs should be greater than the minimum user ID value (min.user.id). Usually, the minimum user ID value is set to 1000.

  6. Set up Kerberos on your Pentaho machines. See the Administer Pentaho Data Integration and Analytics document for instructions.
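
The operating system account check in the steps above can be partly automated. The following Python sketch is a minimal example, not a Pentaho or Cloudera tool: it looks up each user on the local node and reports accounts that are missing or whose UID is not greater than the minimum user ID. The function names and the sample usernames are hypothetical, and the threshold of 1000 is the usual value mentioned above; substitute the min.user.id value configured on your platform. The lookup relies on Python's pwd module, which is available only on Unix-like systems, and must be run on each node.

```python
def lookup_local_uids(usernames):
    """Map each username to its UID on this node, or None if no OS account exists."""
    import pwd  # Unix-only standard library module, imported lazily

    uids = {}
    for name in usernames:
        try:
            uids[name] = pwd.getpwnam(name).pw_uid
        except KeyError:
            # getpwnam raises KeyError when the account does not exist.
            uids[name] = None
    return uids


def check_accounts(uids, min_user_id=1000):
    """Return (missing, too_low): accounts with no UID, and accounts whose
    UID is not greater than min_user_id."""
    missing = sorted(name for name, uid in uids.items() if uid is None)
    too_low = sorted(
        name
        for name, uid in uids.items()
        if uid is not None and uid <= min_user_id
    )
    return missing, too_low


if __name__ == "__main__":
    # Hypothetical accounts, hard-coded so the example runs anywhere.
    # On a cluster node, use lookup_local_uids([...]) instead.
    sample = {"pentaho_etl": 1501, "report_user": 500, "new_analyst": None}
    missing, too_low = check_accounts(sample)
    print("missing:", missing)
    print("UID too low:", too_low)
```

Accounts reported as missing need an operating system user account on that node before you add them to the Kerberos database.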
