Secure the Pentaho system
Lock down access to the Pentaho Server first. Then secure Hadoop and CDP integrations.
User security
You can choose from two different user security options: Pentaho Security or advanced security providers (such as LDAP, Single Sign-On, or Microsoft Active Directory).
For the Pentaho User Console (PUC), your predefined users and roles can be used if you are already using a security provider such as Lightweight Directory Access Protocol (LDAP), Microsoft Active Directory (MSAD), or Single Sign-On (SSO). Pentaho Data Integration (PDI) can also be configured to use your implementation of these providers or Kerberos to authenticate users and authorize data access.
These articles guide you through the process of configuring third-party security frameworks for the Pentaho Server.
Note: If you are evaluating Pentaho or have a production environment with fewer than a hundred users, you may decide to use Pentaho default security. See the Install Pentaho Data Integration and Analytics document for details.
Before you can implement advanced security, you must have installed and configured the Pentaho Server. You should have administrative-level knowledge of the security provider you want to use, details about your user community, and a plan for the user roles to be used in PDI. You should also know how to use the command line to issue commands for Microsoft Windows or Linux.
PUC can be used to perform most security tasks pertaining to the console. For some cases with PDI, you will need a text editor to modify text files. Some of these security tasks also require that you work on the actual machine where the Pentaho Server is installed.
All of the tasks that use the Administration page in PUC require that you log on to the User Console with the Pentaho administrator user name and password.
For information on the two different user security options you can choose from, see the following sections:
Pentaho Server security
Pentaho Security is a quick way to configure security. It works well without a security provider. It also works well for communities under 100 users.
The Pentaho User Console (PUC) lets you define security by users and roles. The Pentaho Server controls which users and roles can access web resources and repository content.
Hiding user folders in PUC and PDI
One way you can centralize and secure content created by users is to hide individual users' Home folders in the Pentaho User Console (PUC) or in the PDI client. For example, if your organization implements multi-tenancy, you may want to prevent individual users from viewing their Home folders for security reasons.
You can configure your server to hide the Home folders by default for both PUC and PDI. When you create new users in your system, their Home folders will be hidden. If a user needs to create, edit, or save content, you can provide the Write permission in a folder that is visible to that user. Those users can then view the folder and access the content. You can add the Write permission in PUC and in the PDI client.
These tasks assume you are a Pentaho Administrator.
Perform the following steps to edit the system.properties file so that when you create new users, their Home folders will be hidden by default.
Stop the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
Navigate to /Pentaho/server/pentaho-server/pentaho-solutions/system and open system.properties in a text editor.
Locate the hideUserHomeFolderOnCreate property. By default, this property is set to false.
Change the setting to true: hideUserHomeFolderOnCreate=true
Save and close system.properties.
Start the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
Now when you add a new user in either PUC or the PDI client, that user's Home folder is hidden by default.
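The property change in the steps above can also be scripted. The following is a minimal Python sketch, not an official Pentaho utility. The helper name set_property is hypothetical, and the sketch assumes system.properties uses plain key=value lines. Stop the Pentaho Server before applying anything like this to the real file.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with key set to value, appending it if absent."""
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without "=" untouched.
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Assumed sample content; the real file contains many more properties.
sample = "# sample content (assumed)\nhideUserHomeFolderOnCreate=false\n"
updated = set_property(sample, "hideUserHomeFolderOnCreate", "true")
print(updated)
```

To use it on the real file, read system.properties into a string, pass it through set_property, and write the result back after making a backup copy.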
Next steps:
To override this setting for a specific user, see Override the hidden Home folder for a user.
To stop hiding Home folders by default when you create new users, see Stop hiding the Home folder for new users.
If a user with a hidden Home folder needs to create, edit, or save content, grant Write permission using PUC or the PDI client.
Override the hidden Home folder for a user
Follow these steps to override the hidden Home folder for a specific user.
Log in to PUC with your Pentaho Administrator credentials.
Go to Browse Files.
Select the user’s Home folder.
In the Properties dialog box, clear Hidden.
The user's Home folder is now visible.
Stop hiding the Home folder for new users
Follow these steps to stop creating users with their Home folders hidden by default.
Stop the Pentaho Server.
Navigate to /Pentaho/server/pentaho-server/pentaho-solutions/system and open system.properties in a text editor.
Locate the hideUserHomeFolderOnCreate property.
Change the setting to false: hideUserHomeFolderOnCreate=false
Save and close system.properties.
Start the Pentaho Server.
When you add a new user in either PUC or the PDI client, that user's Home folder is now visible.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
Assign the Write permission to a user folder in PUC
In PUC you can assign Write permission in a public folder to a user whose Home folder is hidden. When this permission is granted, the user can save and edit content they create using PUC.
Log in to PUC with your Pentaho Administrator credentials.
Select the Public folder, and then select or create the folder you want the user to access.
Assign Write permission:
Click Properties > Share and clear Inherits folder permissions.
Click Add and select the user.
Select Write for the user.
Click OK to save your changes.
The user can now save content in the assigned folder.
Assign the Write permission to a user folder in the PDI client
In the PDI client you can assign the Write permission in a public folder to a user whose Home folder is hidden. When this permission is granted, the user can save and edit content they create using the PDI client.
Start the PDI client.
See the Pentaho Data Integration document for instructions on starting the PDI client.
Connect to a Pentaho Repository with your Pentaho Administrator credentials.
Open the Repository Explorer: Tools > Repository > Explore.
On the Browse tab, select the Public folder, and then select or create the folder you want the user to access.
Select the folder and grant Write permission:
On the Access Control panel, clear Inherit access control from parent.
Click the Plus sign to add a user.
Select the user and move them to Selected. Click OK.
Select the user in User/Role and select Write.
Click Apply, then click OK.
Close the Repository Explorer.
The user can now save and edit content in the assigned folder.
Restrict or share files and folders
Access to files or folders can be refined using the Pentaho User Console. Each file or folder can either use the default permissions or you can tailor them for specific users and roles.
Prior to performing this task, determine whether you will use the default Pentaho roles or create custom users and roles. You must also have successfully set up your security back end.
Log in to the User Console using the administrator role.
From Browse Files, choose the folder you want to set permissions on from the Folders pane.
If you want to set permissions on a specific file in that folder, highlight the file in the Files pane.
Click Properties in the Actions pane.
The Properties window appears.
On the Share tab, select the Role that you want to set permissions for. Then clear Inherits folder permissions.
Permissions for [Role] becomes available.
Select permissions for that role, then click OK.
The permissions are set for that file or folder and are associated with the selected role.
For additional security in multi-tenancy organizations, you can hide individual users' Home folders. See Hiding user folders in PUC and PDI.
Pass authentication credentials in URL parameters
This section is currently a placeholder. Add the approved guidance for passing credentials in URL parameters.
Remove Pentaho Server security
You can remove Pentaho Server security by enabling anonymous access or by modifying data source management.
Enable anonymous access
You can bypass built-in security on the Pentaho Server by giving all permissions to anonymous users. An "anonymousUser" is any user, either existing or newly created, that you specify as an all-permissions, no-login user, and to whom you grant the Anonymous role.
This procedure grants full Pentaho Server access to the Anonymous role. It also removes the login requirement.
All of the files you will be using are located in /pentaho/server/pentaho-server/pentaho-solutions/system. Before you begin, stop the Pentaho Server.
Modify application security
Perform the following steps to modify application security:
Open applicationContext-spring-security.xml in a text editor.
Make sure a default anonymous role is defined. Match your bean definition and property value to the following example:
Note: These next steps permit PDI client tools to publish to the Pentaho Server without a user name and password.
Find these two beans in the same file:
filterInvocationInterceptor
filterInvocationInterceptorForWS
Locate the securityMetadataSource property in the beans and match the contents to the following example:
Save and close applicationContext-spring-security.xml.
Modify Pentaho configuration
Perform the following steps to modify the Pentaho configuration:
Open pentaho.xml in a text editor.
Find the anonymous-authentication section under pentaho-system, and define the anonymous user and role as shown in the following example:
Save and close pentaho.xml.
Modify repository properties
Perform the following steps to modify the repository properties:
Open repository.spring.properties in a text editor.
Find singleTenantAdminAuthorityName and replace the value with Anonymous.
Find singleTenantAdminUserName and replace the value with your anonymous user name.
Save and close the file.
Map the appropriate role
Perform the following steps to map roles:
Find all references to the bean id="Mondrian-UserRoleMapper". Make sure the only active mapper is the one shown in the following example:
If you changed pentahoObjects.spring.xml, save and close the file.
You have now worked around Pentaho Server security. If you use the relational metadata database model, refer to Remove Security from Metadata Domain Repository for the next steps.
Remove security from data source management
This procedure changes your data source management so that an anonymous user can access it. These steps are necessary to completely remove security from the Pentaho Server, but they do not remove all security on their own. To remove all security, also enable anonymous access as described above.
Perform the following steps to remove security from data source management:
Stop the Pentaho Server (if needed).
Open /pentaho/server/pentaho-server/pentaho-solutions/system/data-access/settings.xml in a text editor.
Find <data-access-roles>Administrator</data-access-roles> and change Administrator to Anonymous.
Find <data-access-view-roles>Authenticated,Administrator</data-access-view-roles> and change Authenticated,Administrator to Anonymous.
Find <data-access-view-users>suzy</data-access-view-users> and change suzy to anonymousUser.
Find <data-access-datasource-solution-storage>admin</data-access-datasource-solution-storage> and change admin to anonymousUser.
Save and close the file.
Restart the Pentaho Server.
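The four element changes above can be sketched in Python with the standard library XML parser. This is an illustration only: the function name open_data_access is hypothetical, and the sample document (including its root element name) is an assumed stand-in for the real settings.xml.

```python
import xml.etree.ElementTree as ET

# Element names and replacement values are taken from the steps above.
CHANGES = {
    "data-access-roles": "Anonymous",
    "data-access-view-roles": "Anonymous",
    "data-access-view-users": "anonymousUser",
    "data-access-datasource-solution-storage": "anonymousUser",
}


def open_data_access(xml_text: str) -> str:
    """Return the settings XML with the four data-access elements rewritten."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag in CHANGES:
            element.text = CHANGES[element.tag]
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down stand-in for settings.xml; the root name is assumed.
sample = (
    "<settings>"
    "<data-access-roles>Administrator</data-access-roles>"
    "<data-access-view-roles>Authenticated,Administrator</data-access-view-roles>"
    "<data-access-view-users>suzy</data-access-view-users>"
    "<data-access-datasource-solution-storage>admin</data-access-datasource-solution-storage>"
    "</settings>"
)
result = open_data_access(sample)
print(result)
```

Note that ElementTree discards XML comments when parsing, so editing by hand is the safer choice if the comments in the file need to be kept.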
Advanced security providers
Use these options when the default Pentaho security provider is not enough.
This section consolidates configuration guidance for common advanced providers and related hardening tasks.
Spring (authentication providers) security
Spring security is a cascading security implementation that moves down through a list of authentication providers. If the first provider fails to authenticate, then the application looks to the next provider in the list to authenticate. If you are using multiple AuthenticationProviders at the same time, you must add each security provider to the applicationContext-spring-security.xml file. You must also add provider name values to the activeUserDetailsService beans in the pentahoObjects.spring.xml file. We recommend that you make a backup of these files before altering them.
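The cascading behavior can be pictured with a small Python model. This is a simplified sketch of the idea, not Spring Security code: the provider functions and user dictionaries are invented stand-ins, and real providers can also stop the chain for reasons this model ignores.

```python
from typing import Callable, Optional, Sequence, Tuple

# A provider takes (username, password) and returns True when it authenticates the user.
Provider = Callable[[str, str], bool]


def authenticate(
    providers: Sequence[Tuple[str, Provider]], username: str, password: str
) -> Optional[str]:
    """Try each provider in list order; return the name of the first that succeeds."""
    for name, check in providers:
        if check(username, password):
            return name
    return None


# Stand-in user stores with hard-coded credentials, for illustration only.
jackrabbit_users = {"admin": "password"}
ldap_users = {"suzy": "secret"}

providers = [
    ("jackrabbit", lambda u, p: jackrabbit_users.get(u) == p),
    ("ldap", lambda u, p: ldap_users.get(u) == p),
]

print(authenticate(providers, "suzy", "secret"))  # first provider fails, second is tried
print(authenticate(providers, "nobody", "x"))     # no provider accepts the credentials
```

The order of the list matters: a user known to more than one provider is authenticated by whichever provider appears first.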
ApplicationContext
Perform the following steps to add security providers to the ApplicationContext:
Stop the Pentaho Server and the solution repository.
Navigate to the /pentaho-solutions/system directory and open the applicationContext-spring-security.xml file with any text editor.
Locate the following authenticationManager bean tags:
Add your AuthenticationProvider information below the list tag. The example below adds the jackrabbit provider:
Then, add providerName information right beneath the jackrabbit information. LDAP is used in this example. You can add as many providers as needed:
After you are finished adding AuthenticationProvider information, save and close the file.
The following code block is a more complete example of the authenticationManager portion of the applicationContext-spring-security.xml file:
Add the Jackrabbit provider
The Jackrabbit provider is required in the activeUserDetailsService bean, even if you configure another provider. Perform the following steps to add the Jackrabbit provider to the activeUserDetailsService bean:
Navigate to the /pentaho-solutions/system directory and open the pentahoObjects.spring.xml file with any text editor.
Locate the activeUserDetailsService bean tag:
Replace ${security.provider} with the jackrabbit provider value. For example:
Add another provider
Perform the following steps to add more provider names:
Duplicate the activeUserDetailsService bean shown in Substep 2 of the Add the Jackrabbit Provider section.
Rename the bean ID, for example: bean id="activeUserDetailsService2"
Replace the jackrabbit value with the new provider value. For example:
Locate the following UserDetailsService bean tags:
Add your bean ID to the list element. For example:
Restart the Pentaho Server and solution repository.
Configure authentication
To configure Web resource authentication to correspond with your user roles in the Pentaho Server, perform the following instructions.
Ensure that the Pentaho Server is not currently running; if it is, run the stop-pentaho script.
Open a Terminal or Command Prompt window and navigate to the .../pentaho-solutions/system/ directory.
Edit the applicationContext-spring-security.xml file with a text editor.
Find and examine the following property: <property name="objectDefinitionSource">
Modify the regex patterns to include your roles.
The objectDefinitionSource property associates URL patterns with roles. RoleVoter specifies that if any role on the right hand side of the equals sign is granted to the user, the user may view any page that matches that URL pattern. The default roles in this file are not required. You can replace, delete, or change them in any way that suits you.
You should now have coarse-grained permissions established for user roles.
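The relationship between URL patterns, roles, and the "any role grants access" rule can be sketched in Python. The patterns and role lists below are hypothetical examples, not the defaults shipped in the file, and treating the first matching pattern as the deciding one is an assumption of this sketch.

```python
import re
from typing import List, Optional, Set, Tuple

# Hypothetical rules: a URL pattern on the left, the roles allowed to view it on the right.
RULES: List[Tuple[str, Set[str]]] = [
    (r"\A/login.*\Z", {"Anonymous", "Authenticated"}),
    (r"\A/admin.*\Z", {"Administrator"}),
    (r"\A/.*\Z", {"Authenticated"}),
]


def required_roles(url: str) -> Optional[Set[str]]:
    """Return the roles attached to the first pattern that matches the URL."""
    for pattern, roles in RULES:
        if re.match(pattern, url):
            return roles
    return None


def may_view(url: str, user_roles: Set[str]) -> bool:
    """Grant access when the user holds any role listed for the matching pattern."""
    roles = required_roles(url)
    return roles is not None and bool(roles & user_roles)


print(may_view("/admin/users", {"Authenticated"}))
print(may_view("/admin/users", {"Administrator", "Authenticated"}))
```

Because the more specific patterns come first in this sketch, a catch-all pattern at the end only applies to URLs that nothing else matched.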
Authentication provider examples
Provider | Description | Configuration file
Jackrabbit | Default Pentaho security | applicationContext-spring-security-jackrabbit.xml
LDAP | LDAP security | applicationContext-spring-security-ldap.xml
JDBC | JDBC security, which allows you to use your own security tables | applicationContext-spring-security-jdbc.xml
Memory | In-memory authentication | applicationContext-spring-security-memory.xml
JDBC security
You must have existing security tables in a relational database in order to proceed with this task.
Switch to JDBC security
Follow the instructions below to switch from Pentaho default security to JDBC security, which will allow you to use your own security tables.
If you want to use encrypted passwords for JDBC security as explained in the Install Pentaho Data Integration and Analytics document, use the encrypted password for all password values.
If you are using the Pentaho Server and choose to switch to a JDBC security shared object, you will no longer be able to use the role and user administration settings in the Administration portion of the User Console.
Stop the Pentaho Server.
Open /pentaho-solutions/system/security.properties with a text editor.
Change the value of the provider property to jdbc.
Set up the connection to the database that holds the users and authorities:
Open the /pentaho-solutions/system/applicationContext-spring-security-jdbc.properties file with a text editor.
Find the following two lines and change the jdbcDriver and URL to the appropriate values.
Change the user name and password by editing the following two items:
Set the validation.query by editing its row. Examples of different validation queries are shown in the file.
Set the wait timeout, max pool, and max idle by editing the following three items to change the defaults.
Save the file and close the editor.
If needed, modify the user queries that pull information about users and authorities:
Open /pentaho-solutions/system/applicationContext-spring-security-jdbc.xml with a text editor.
Find the following line and change the SQL query returning the user and roles for which the user is a member to the appropriate statement:
Find the following line and change the SQL query that determines the user, password, and whether they can log in to the appropriate statement:
If needed, modify the following role queries that pull information about users and authorities.
Open the /pentaho-solutions/system/applicationContext-pentaho-security-jdbc.xml file with a text editor.
Find the following line and change the SQL query showing the roles for security on objects to the appropriate statement:
Find the following line and change the SQL query that returns all users in a specific role to the appropriate statement:
Find the following line and change the SQL query that returns all users by order to the appropriate statement:
Save the file and close the editor.
Update the default Pentaho admin user on the system to map to your JDBC admin user:
Open the /pentaho-solutions/system/repository.spring.properties file with a text editor.
Find the following lines and change the default value from <admin> to map to your <admin username> in your JDBC system:
Save the file and close the editor.
To fully map the JDBC admin role to other configuration files, specify the name of the administrator role for your JDBC authentication database in the applicationContext-pentaho-security-jdbc.xml file.
Open the /pentaho-solutions/system/applicationContext-pentaho-security-jdbc.xml file with a text editor.
Find the following lines and change the entry key to the key assigned to the administrator role in your JDBC authentication database:
Save and close the file.
Start the Pentaho Server.
The server is configured to authenticate users against the specified database.
Manual LDAP/JDBC hybrid configuration
You might need to create a hybrid between an LDAP security solution and a JDBC security table for role definitions. This is common in situations where LDAP roles can't be redefined for Pentaho Server use. These instructions help you switch the Pentaho Server's authentication back-end from the Pentaho data access object to an LDAP/JDBC hybrid.
Before you begin configuring LDAP and JDBC for the Pentaho Server, you will need to verify a couple of things.
Verify Successful Default Pentaho Security Deployment
Make sure your Pentaho Server has been successfully deployed using default Pentaho Security (Jackrabbit authentication).
Configure Pentaho for LDAP Authentication
Verify that your Pentaho system is configured for LDAP authentication.
Verify Database with User Roles
Verify that you have a database populated with your user roles.
After you finish the prerequisite tasks above, complete the following steps to set up a hybrid LDAP/JDBC configuration. The table structure described here is for example purposes.
These sections will guide you through the remaining steps of this process:
Step 1: Create user/authorities database tables
Step 2: Update user and role values for tables
Step 3: Update JDBC security queries
Step 4: Enable JDBC authorization beans
Step 5: Verify LDAP/JDBC configuration
Step 1: Create user/authorities database tables
You will need to create a few database tables in order to get LDAP and JDBC to work together.
Create a table called USERS with the following columns:
Column Name | Column Type | Column Description
username | VARCHAR(50) | The user name.
password | VARCHAR(50) | This column value is not considered in a hybrid LDAP/JDBC solution.
enabled | VARCHAR(100) | Set to true if the user is enabled; false if not enabled.
Create a table called AUTHORITIES with the following columns:
Column Name | Column Type | Column Description
authority | VARCHAR(50) | The Pentaho role, such as Administrator or Report Author.
Create a table called GRANTED_AUTHORITIES with the following columns:
Column Name | Column Type | Column Description
username | VARCHAR(50) | The user name.
authority | VARCHAR(5) | Associated Pentaho role.
Step 2: Update user and role values for tables
Next, you will need to perform a series of updates for the tables you just created. Users will be authenticated using their Active Directory password.
Note: Some syntax examples are provided here for you to customize with your own values.
Update usernames and passwords in the USERS table as shown:
Update roles in the AUTHORITIES table as shown:
Update users with their associated roles in the GRANTED_AUTHORITIES table as shown:
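As a rough illustration of the three updates, the following Python sketch builds the example tables in a throwaway in-memory SQLite database and fills them with sample values. It is not the syntax from the product documentation: the user suzy and the role values are placeholders, SQLite TEXT columns stand in for the VARCHAR columns, and your own database will need its own SQL dialect and client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, for illustration only
conn.executescript(
    """
    CREATE TABLE USERS (username TEXT, password TEXT, enabled TEXT);
    CREATE TABLE AUTHORITIES (authority TEXT);
    CREATE TABLE GRANTED_AUTHORITIES (username TEXT, authority TEXT);
    """
)

# The password value is ignored in a hybrid setup because LDAP authenticates the user.
conn.execute("INSERT INTO USERS VALUES (?, ?, ?)", ("suzy", "ignored", "true"))
conn.executemany(
    "INSERT INTO AUTHORITIES VALUES (?)",
    [("Administrator",), ("Report Author",)],
)
conn.execute(
    "INSERT INTO GRANTED_AUTHORITIES VALUES (?, ?)", ("suzy", "Report Author")
)
conn.commit()


def roles_for(username: str) -> list:
    """Return the roles granted to a user, sorted by name."""
    rows = conn.execute(
        "SELECT authority FROM GRANTED_AUTHORITIES WHERE username = ? ORDER BY authority",
        (username,),
    ).fetchall()
    return [row[0] for row in rows]


print(roles_for("suzy"))
```

The lookup in roles_for mirrors the kind of query the JDBC role provider needs: given a user name, return the authorities granted to that user.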
Step 3: Update JDBC security queries
You might have different names for your created tables than are provided in these examples. If so, after you have updated your user and role values in your tables, you need to update a couple of queries and other items to match your system names.
Locate the /pentaho-server/pentaho-solutions/system directory and update these two files with the noted information.
applicationContext-pentaho-security-jdbc.xml
applicationContext-spring-security-jdbc.xml
Update the query, as well as field names such as username, password, and enabled that are expected by spring framework security. Be sure to use an alias if you are using different field names.
Stop the Pentaho Server.
Copy your respective database JDBC driver to the tomcat/lib directory.
See the JDBC drivers reference in the Try Pentaho Data Integration and Analytics document for information on supported drivers.
Step 4: Enable JDBC Authorization beans
Last, you will need to enable some JDBC Authorization beans.
Update security properties file
Perform the following steps to update the security.properties file:
Stop the Pentaho Server.
Locate the pentaho-server/pentaho-solutions/system directory.
Open the security.properties file with any text editor.
Locate the LDAP property bean and add the role provider as shown here:
Save and close the file.
Update spring security-jdbc properties file
Perform the following steps to update the applicationContext-spring-security-jdbc.properties file:
Open the applicationContext-spring-security-jdbc.properties file.
Add or update this database information with your system values.
Database Setting | Description
datasource.driver.classname | Fully qualified Java class name of the JDBC driver you are using.
datasource.url | Connection URL to be passed to your JDBC driver to establish a connection.
datasource.username | Connection user name to be passed to your JDBC driver to establish a connection.
datasource.password | Connection password to be passed to your JDBC driver to establish a connection.
datasource.validation.query | SQL query that is used to validate connections from this pool before returning them to the caller. This query must be a SELECT statement that returns at least one row.
datasource.pool.max.wait | Maximum number of milliseconds that the pool waits, when no connections are available, for a connection to be returned before throwing an exception, or <= 0 to wait indefinitely. Default is -1.
datasource.pool.max.active | Maximum number of active connections that can be allocated from this pool at the same time, or negative for no limit. Default value is 8.
datasource.max.idle | Maximum number of connections that can remain idle in the pool, without extra ones being destroyed, or negative for no limit. Default value is 8.
datasource.min.idle | Minimum number of active connections that can remain idle in the pool, without extra ones being created when the evictor runs, or 0 to create none. Default value is 0.
Save and close the file.
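Before restarting the server, it can help to sanity-check the edited settings. The following Python sketch parses properties text and reports obvious problems. The helper names, the choice of which keys to treat as required, and the sample driver class and URL are all assumptions made for illustration.

```python
# Keys this sketch treats as required; that choice is an assumption, not a product rule.
REQUIRED_KEYS = (
    "datasource.driver.classname",
    "datasource.url",
    "datasource.username",
    "datasource.password",
    "datasource.validation.query",
)


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, ignoring blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_datasource(props: dict) -> list:
    """Return a list of problems found in the datasource settings."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if key not in props]
    query = props.get("datasource.validation.query", "")
    if query and not query.lstrip().upper().startswith("SELECT"):
        problems.append("datasource.validation.query must be a SELECT statement")
    return problems


# Placeholder values; substitute the driver, URL, and credentials for your database.
sample = """
datasource.driver.classname=org.postgresql.Driver
datasource.url=jdbc:postgresql://localhost:5432/userdb
datasource.username=pentaho_user
datasource.password=changeme
datasource.validation.query=SELECT 1
datasource.pool.max.wait=-1
"""
props = parse_properties(sample)
print(check_datasource(props))
```

An empty list means the sketch found nothing wrong; it does not prove that the database is reachable or that the credentials are valid.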
Update the pentaho-security-jdbc file
Perform the following steps to update the applicationContext-pentaho-security-jdbc.xml file:
Open the applicationContext-pentaho-security-jdbc.xml file.
Change the entry key to show your admin role for your database.
From:
To:
Save and close the editor.
Update the pentahoObjects spring file
Perform the following steps to update the pentahoObjects.spring.xml file:
Open the pentahoObjects.spring.xml file.
Change these beans as shown, then save and close the file.
From:
To:
Update spring security-ldap file
Perform the following steps to update the applicationContext-spring-security-ldap.xml file:
Open the applicationContext-spring-security-ldap.xml file.
Remove the bean for org.springframework.security.ldap.populator.DefaultLdapAuthoritiesPopulator and replace it with this one:
Save and close the file.
Update the repository spring properties file
Perform the following steps to update the repository.spring.properties file:
Open the repository.spring.properties file.
Locate the value for the singleTenantAdminUserName and make sure that it points to the correct admin user for your system.
Restart the Pentaho Server.
Step 5: Verify LDAP/JDBC Configuration
Pentaho should now be successfully configured with hybrid authentication. Users are authenticated through LDAP, and the roles are authorized through JDBC. You can verify this by logging into PUC as an admin and checking in the Users & Roles tab in the Administration perspective.
SAML security
Security Assertion Markup Language (SAML) is a web-based authentication mechanism that relies on the browser as an agent to broker the authentication flow. There are numerous third-party Identity Providers (IdP) available, such as OpenSSO, OKTA, and SSOCircle.com.
The following diagram is a high-level sketch of a SAML identification structure containing a third-party Identity Provider (IdP), an End-User Browser (the Pentaho User Console), and a Service Provider (the Pentaho Server):

See the guidelines for SAML on the Support Portal for help in setting up an example instance of SAML security. If you want to extend your SAML setup further, work with your Customer Success Manager or contact Support.
LDAP security
To use Lightweight Directory Access Protocol (LDAP) for user security, you must switch from the default Pentaho security to LDAP, then you must configure LDAP.
Switch to LDAP
To connect to your LDAP server, you must import the certificate into the JRE's truststore/keystore used by the Pentaho Server (java/lib/security/cacerts).
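One common way to import a certificate is the JDK keytool utility. The following Python sketch only builds the command so that you can review it before running it. The certificate file name and alias are hypothetical, and the store password shown is the usual JDK default, which may differ on your system.

```python
from typing import List


def keytool_import_command(
    cert_file: str, alias: str, truststore: str, storepass: str
) -> List[str]:
    """Build the argument list for importing a certificate with the JDK keytool."""
    return [
        "keytool", "-importcert",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", truststore,
        "-storepass", storepass,
    ]


cmd = keytool_import_command(
    cert_file="ldap-server.crt",             # hypothetical certificate file
    alias="ldap-server",                     # hypothetical alias
    truststore="java/lib/security/cacerts",  # truststore path as given above
    storepass="changeit",                    # common JDK default; yours may differ
)
print(" ".join(cmd))
# To execute the command: import subprocess; subprocess.run(cmd, check=True)
```

Run the command with the keytool that belongs to the same Java installation the Pentaho Server uses, otherwise the certificate lands in a truststore the server never reads.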
From the User Console Home menu, click Administration, then select Authentication from the left.
The Authentication interface appears. Local - Use basic Pentaho Authentication is selected by default.
Select the External - Use LDAP / Active Directory server option.

[Figure: User Console authentication set to external]
The LDAP Server Connection fields populate with a default URL, user name, and password.
Change the Server URL, User Name, and Password as needed.
Click Test Server Connection to verify the connection to your LDAP server and to complete the setup.
Click the node to select the Pentaho System Administrator user and role to match your LDAP configuration, then click OK.
Note: The Admin user is required for all system-related operations, including the creation of user folders. The Administrator Role is required for mapping a third-party admin role to the Pentaho admin role (Administrator).
Select your LDAP Provider from the drop-down menu.
Configure the LDAP connection as explained in LDAP properties.
Stop the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
Delete the server/pentaho-server/pentaho-solutions/system/karaf/caches folder.
Restart the Pentaho Server and test the LDAP functionality.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
The Pentaho Server is now configured to authenticate users against your LDAP directory server.
Big data security
Secure Hadoop and CDP integrations in Pentaho.
Use Kerberos authentication or secure impersonation. Use Knox when you need a gateway.
Supported CDP security integrations
If you use Knox, see Use Knox to access CDP.
These systems are supported when using Pentaho with a secured CDP cluster.
HDFS | Supported | Supported
Avro | Supported | Supported
ORC | Supported | Not supported
Parquet | Supported | Not supported
YARN | Supported | Not supported
YARN-based Pentaho MapReduce (PMR)* | Supported | Not supported
Sqoop | Supported | Not supported
Hive | Supported | Supported
YARN-based HBase | Supported | Supported
YARN-based Hadoop job executor | Supported | Not supported
Oozie | Not supported | Supported
Impala | Supported | Supported
Spark submit | Supported | Not supported
* You must have permission to write to /opt in the HDFS root directory to run PMR in CDP Public Cloud with Kerberos.
Per CDP support requirements, all cluster nodes must run CentOS. You cannot customize it.
Choose an approach
Kerberos authentication validates each user directly against Kerberos.
Secure impersonation uses a service identity, then impersonates the Pentaho user.
Use secure impersonation when jobs run on the Pentaho Server.
Use Kerberos authentication when you need user-level access without impersonation.
Secure impersonation
The mapping value simple in the driver configuration file enables secure impersonation.
This value is set when you define impersonation settings in a named connection.
How secure impersonation works
At startup, the Pentaho Server checks the mapping type value in the configuration file:
If the value is disabled or blank, the server does not use authentication.
If the value is simple, requests are evaluated by origin.
Requests from a client tool use Kerberos authentication.
Requests from the Pentaho Server use secure impersonation when supported.
If the component does not support secure impersonation, Kerberos is used.
When impersonation succeeds, the Pentaho Server log shows:

Restart the server after changing the mapping type value.
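The decision list above can be restated as a small Python function. This is a reading aid rather than Pentaho code: the names AuthMode and resolve_auth_mode are hypothetical, and raising an error for any other mapping value is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class AuthMode(Enum):
    NONE = "no authentication"
    KERBEROS = "Kerberos authentication"
    IMPERSONATION = "secure impersonation"


def resolve_auth_mode(
    mapping_type: Optional[str],
    from_server: bool,
    component_supports_impersonation: bool,
) -> AuthMode:
    """Apply the mapping type rules described above to a single request."""
    value = (mapping_type or "").strip()
    if value in ("", "disabled"):
        return AuthMode.NONE
    if value == "simple":
        if from_server and component_supports_impersonation:
            return AuthMode.IMPERSONATION
        # Client-tool requests and unsupported components both use Kerberos.
        return AuthMode.KERBEROS
    # Other values are not described above; this sketch rejects them.
    raise ValueError(f"unrecognized mapping type: {mapping_type!r}")


print(resolve_auth_mode("simple", from_server=True, component_supports_impersonation=True))
print(resolve_auth_mode("simple", from_server=False, component_supports_impersonation=True))
```

The second call shows the case that surprises people most often: with the same mapping value, a request from a client tool still uses Kerberos rather than impersonation.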
Secure impersonation prerequisites
To use secure impersonation:
The cluster must be secured with Kerberos.
The Kerberos server must be reachable from the Pentaho Server.
Kerberos must be installed and configured on the Pentaho machine.
A cluster-side Kerberos principal must represent Pentaho.
That principal must be allowed to impersonate users.
Requests must originate from the Pentaho Server.
Target components must support secure impersonation.
The cluster administrator is responsible for cluster users and Kerberos server setup.
See the vendor and Hadoop security documentation for details.
Secure impersonation supported components
Secure impersonation support is determined by the underlying Hadoop components.
Supported:
Cloudera-Impala
HBase
HDFS
Hadoop MapReduce
Hive
Oozie
Pentaho MapReduce (PMR)
You can securely connect to Hive and HBase within the mapper, reducer, or combiner.
Not supported:
Carte on YARN
Impala
Sqoop
Spark SQL
Secure impersonation directly from these tools is not supported:
PDI client (Spoon)
Scheduled jobs and transformations
Pentaho Report Designer
Pentaho Metadata Editor
Kitchen
Pan
Carte
Configure MapReduce jobs (Windows only)
On Windows, update mapred-site.xml so MapReduce jobs run with secure impersonation.
Open:
<username>/.pentaho/metastore/pentaho/NamedCluster/Configs/<connection name>/mapred-site.xml
Add these properties:
Save the file.
Connect to a Cloudera Impala database (Cloudera only)
If you connect to a secure Impala database, update the PDI database connection options.
Download the Cloudera Impala JDBC driver: https://www.cloudera.com/downloads/connectors/impala/jdbc/2-6-15.html
Secure impersonation with Impala is supported only with the Cloudera Impala JDBC driver.
Extract ImpalaJDBC41.jar into:
<username>/.pentaho/metastore/pentaho/NamedCluster/Configs/cdp71/lib
Create a database connection in the PDI client.
Set these general values:
Connection Type: Cloudera Impala
Database Name: default
Port Number: 443
On Options, set:
KrbHostFQDN: Fully qualified domain name of the Impala host
KrbServiceName: Service principal name of the Impala server
KrbRealm: Kerberos realm used by the cluster
Select Test.
Next steps
When the cluster is connected to the Pentaho Server, you can run jobs and transformations using secure impersonation.
Secure impersonation from the PDI client is not supported.
Kerberos authentication
Use Kerberos to authenticate access to secure Hadoop and CDP components.
Set up Kerberos for Pentaho
How you set up Kerberos on a machine that the Pentaho Server can access depends on your operating system.
Configure Kerberos
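To configure Kerberos, complete the tasks for your operating system. The first set of tasks below uses Linux-style paths (for example, /etc/krb5.conf). The second set uses Windows paths.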
Configure JCE
The KDC configuration uses an “unlimited” AES-256 encryption setting by default for the Java Cryptographic Extension (JCE) files. Cryptographic policy requirements vary by country.
Do these steps only if you must reduce the encryption strength:
Open pentaho/java/conf/security/java.security.
Find crypto.policy and set it to: crypto.policy=limited
Save the file.
Modify the Kerberos configuration file
Open krb5.conf. The default location is /etc/krb5.conf.
Add your realm, KDC, and admin server values.
Save the file.
Restart the machine.
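The realm, KDC, and admin server entries mentioned above might look like the output of the following sketch. It assumes the standard MIT Kerberos krb5.conf layout ([libdefaults], [realms], [domain_realm]). The realm EXAMPLE.COM and the host names are placeholders, and the helper name is hypothetical. Use the values your cluster administrator provides.

```python
def render_krb5_conf(realm, kdc, admin_server, domain):
    """Build a minimal krb5.conf body from site-specific values.

    All arguments are placeholders supplied by the caller; nothing here
    is specific to Pentaho.
    """
    realm = realm.upper()  # Kerberos realm names are conventionally uppercase
    lines = [
        "[libdefaults]",
        f"    default_realm = {realm}",
        "",
        "[realms]",
        f"    {realm} = {{",
        f"        kdc = {kdc}",
        f"        admin_server = {admin_server}",
        "    }",
        "",
        "[domain_realm]",
        f"    .{domain} = {realm}",
        f"    {domain} = {realm}",
        "",
    ]
    return "\n".join(lines)


# Placeholder values for illustration only.
print(render_krb5_conf("EXAMPLE.COM", "kdc.example.com",
                       "kdc.example.com", "example.com"))
```

Generating the file from a few variables keeps the realm spelling identical in every section, which is an easy detail to get wrong when editing by hand.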
Synchronize clocks
Synchronize the client clock with the cluster clock. Kerberos fails if timestamps drift too far.
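As a rough illustration of "too far", the sketch below compares two timestamps against a tolerance. It assumes the MIT Kerberos default clock skew of 300 seconds; your KDC may be configured with a different limit, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 300 seconds is the default clockskew in MIT Kerberos.
# A site may configure a different tolerance.
DEFAULT_MAX_SKEW = timedelta(seconds=300)


def within_skew(client_time, cluster_time, max_skew=DEFAULT_MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(client_time - cluster_time) <= max_skew


now = datetime.now(timezone.utc)
print(within_skew(now, now + timedelta(minutes=2)))   # drift inside the tolerance
print(within_skew(now, now + timedelta(minutes=10)))  # drift outside the tolerance
```

In practice a time service such as NTP keeps both sides aligned. A check like this is only useful for diagnosing a failed ticket request.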
Obtain a Kerberos ticket
Run kinit.
Enter the password when prompted.
Confirm the ticket exists by running klist.
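The ticket check can also be scripted. The sketch below assumes an MIT Kerberos client, where klist -s runs silently and exits with status 0 only when a readable, unexpired credentials cache exists. Other Kerberos clients may use a different flag. The function name and the injectable run parameter are hypothetical conveniences.

```python
import subprocess


def has_valid_ticket(run=subprocess.run):
    """Return True if `klist -s` reports a usable ticket cache.

    Assumes MIT Kerberos semantics for the -s flag. Returns False when
    klist is not installed or not on the PATH.
    """
    try:
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("ticket present" if has_valid_ticket() else "no valid ticket; run kinit")
```

A script like this could run before a scheduled job so that an expired ticket is reported clearly instead of surfacing as a connection error.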
Configure JCE
The KDC configuration uses an “unlimited” AES-256 encryption setting by default for the Java Cryptographic Extension (JCE) files. Cryptographic policy requirements vary by country.
Do these steps only if you must reduce the encryption strength:
Open pentaho\java\conf\security\java.security.
Find crypto.policy and set it to: crypto.policy=limited
Save the file.
Download and install Kerberos
Install a Kerberos client. Heimdal is a common option: https://www.secure-endpoints.com/heimdal/.
Modify the Kerberos configuration file
Open krb5.conf. The default location is C:\ProgramData\Kerberos\krb5.conf.
Add your realm, KDC, and admin server values.
Save the file.
Copy the file to C:\Windows\krb5.ini.
Restart the machine.
Synchronize clocks
Synchronize the client clock with the cluster clock. Kerberos fails if timestamps drift too far.
Obtain a Kerberos ticket
Run kinit.
Enter the password when prompted.
Confirm the ticket exists by running klist.
If you use Heimdal, klist output should not show Current LoginId is ....
Set up user accounts and network access (all OS)
Ensure user accounts and network access exist before connecting.
Open the required network ports between the cluster and Pentaho components.
Confirm forward and reverse DNS resolution.
Create a Kerberos principal for each Pentaho user who needs access.
Ensure UID and GID match across all cluster nodes for the run user.
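The DNS and port checks in this list can be scripted with the Python standard library. The following is a minimal sketch, assuming you know the host names and ports your cluster uses. The function names are hypothetical, and the resolver functions are passed in as parameters only so the logic can be exercised without a live network.

```python
import socket


def dns_round_trip(hostname,
                   forward=socket.gethostbyname,
                   reverse=socket.gethostbyaddr):
    """Resolve a name to an address, then resolve the address back.

    Returns (address, reverse_name, matches), where matches is True when
    the reverse lookup returns the original name (ignoring case and a
    trailing dot).
    """
    address = forward(hostname)
    reverse_name, _aliases, _addresses = reverse(address)
    matches = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return address, reverse_name, matches


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling dns_round_trip with a cluster node's fully qualified name shows whether forward and reverse records agree, and port_open with that host and a required port confirms the firewall rule is in place. Both calls need real host names from your environment.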
Next step
Continue cluster connection setup in the Install Pentaho Data Integration and Analytics guide.
Use Kerberos with MongoDB
If you use Kerberos to authenticate access to MongoDB, you can also use Kerberos to authenticate PDI users who access MongoDB through a transformation step.
When a user runs a transformation containing a MongoDB step, the step credentials are validated against the Kerberos administrative database. If the credentials match, the KDC grants a ticket.
Complete MongoDB and client prerequisites
Install and configure MongoDB Enterprise.
Configure MongoDB for Kerberos authentication.
Install the current PDI client on each client machine.
Verify forward and reverse DNS resolution for MongoDB hosts.
Add users to the Kerberos database
Add a Kerberos principal for each PDI client user who needs MongoDB access.
Sign in to the host that runs the Kerberos database as root (or equivalent).
Add a principal for the user.
The principal should match the user created in MongoDB.
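Adding a principal might look like the command this sketch assembles. It assumes an MIT Kerberos KDC, where kadmin.local -q "addprinc <principal>" creates a principal and prompts for its password. Other Kerberos implementations use different commands. The user name jdoe, the realm EXAMPLE.COM, and the helper name are placeholders.

```python
import shlex


def addprinc_command(username, realm):
    """Build the argument list for adding a principal on an MIT KDC.

    The command is only assembled here, not executed. Run it on the host
    that holds the Kerberos database.
    """
    if not username or "@" in username or any(c.isspace() for c in username):
        raise ValueError(f"invalid user name: {username!r}")
    principal = f"{username}@{realm.upper()}"
    return ["kadmin.local", "-q", f"addprinc {principal}"]


# Placeholder user and realm for illustration only.
command = addprinc_command("jdoe", "example.com")
print(shlex.join(command))
```

Building the principal from the same user name that was created in MongoDB helps keep the two identities aligned.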
Use Knox to access CDP
Apache Knox provides perimeter security for CDP services. It gives you a single gateway endpoint instead of per-service endpoints.
Knox typically authenticates a user via LDAP, then authenticates to Kerberos, then authorizes via Ranger.

Setup requirements for Knox with Pentaho
As a cluster administrator, provide this information to Pentaho users:
Credentials: Cluster name, gateway URL, username, and password.
SSL certificate: Knox URLs are HTTPS. Install the certificate.
See SSL Security.
LDAP directory server: Knox commonly authenticates users against LDAP.
See LDAP security.
Hive configuration with Knox
Open your Hive database connection.
In the Database Connection dialog, select Options.
Set these parameters:
httpPath: datahub_cluster_name/cdp-proxy-api/hive
knox (optional): true
transportMode: http
ssl: true
In General, set Port number to 443.
You can now use the connection in Hive steps.
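To see how these options fit together, the sketch below assembles them into a conventional Hive JDBC URL of the form jdbc:hive2://host:port/;key=value. PDI builds its own connection string from the dialog, so this is only an illustration. The gateway host name and the function name are hypothetical, and the Pentaho-specific knox option is left out because it is not a standard Hive JDBC setting.

```python
def hive_knox_jdbc_url(gateway_host, http_path, port=443):
    """Assemble a Hive JDBC URL that routes through a Knox gateway.

    gateway_host and http_path are site-specific; the values used below
    are placeholders.
    """
    options = {
        "ssl": "true",            # Knox URLs are HTTPS
        "transportMode": "http",  # Hive over HTTP rather than binary Thrift
        "httpPath": http_path,    # Knox proxy path for the Hive service
    }
    option_text = ";".join(f"{key}={value}" for key, value in options.items())
    return f"jdbc:hive2://{gateway_host}:{port}/;{option_text}"


# Placeholder gateway host; the path follows the pattern given in the steps.
print(hive_knox_jdbc_url("knox-gateway.example.com",
                         "datahub_cluster_name/cdp-proxy-api/hive"))
```

Seeing the full URL can help when comparing the PDI connection against a URL that works in another JDBC client.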
Legacy and moved pages
Some related pages are kept for backward compatibility and older navigation paths.