# How to enable secure impersonation

This topic describes how to enable secure impersonation and how the Pentaho Server processes impersonation requests. The mapping value `simple` in the driver configuration file turns on secure impersonation. This value is set when you specify impersonation settings while creating a named connection. See the **Pentaho Data Integration** document for instructions on creating a named connection.

## Understanding secure impersonation

When the Pentaho Server starts, it checks the mapping type value in the configuration file:

* If the value is **disabled** or blank, the server does not use authentication when connecting to the cluster, so it cannot log on to a secured cluster.
* If the value is **simple**, each request is evaluated to determine whether it originates from the PDI client tool (Spoon) or from the Pentaho Server.
  * Requests from a client tool use Kerberos authentication.
  * Requests from the Pentaho Server are checked to see whether the service component supports secure impersonation. If it does, the request uses secure impersonation. If it does not, the request uses Kerberos authentication.
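
The decision flow described above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not Pentaho code: the names `AuthMode` and `select_auth_mode` and the boolean parameters are hypothetical.

```python
from enum import Enum


class AuthMode(Enum):
    """Possible outcomes of the evaluation (hypothetical names)."""
    NONE = "no authentication"
    KERBEROS = "Kerberos authentication"
    IMPERSONATION = "secure impersonation"


def select_auth_mode(mapping_type, from_server, component_supports_impersonation):
    """Return the mode the rules above would select for one request."""
    value = (mapping_type or "").strip().lower()
    if value in ("", "disabled"):
        # Disabled or blank: no authentication is used.
        return AuthMode.NONE
    if value != "simple":
        raise ValueError(f"unsupported mapping type: {mapping_type!r}")
    if not from_server:
        # Requests from a client tool use Kerberos.
        return AuthMode.KERBEROS
    if component_supports_impersonation:
        return AuthMode.IMPERSONATION
    # Server request, but the component cannot impersonate.
    return AuthMode.KERBEROS


print(select_auth_mode("simple", True, True).value)  # prints "secure impersonation"
```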

When impersonation is successful, the Pentaho Server log reports the following message:

```
"Everything looks good! [Hadoop User] is successfully impersonating as [Pentaho User]."
```

![Secure impersonation overview](/files/zVNug9jLdefx6zIbiiRP)

**Note:** If you change the mapping type value in the configuration file, you must restart the server for it to take effect.

## Use secure impersonation with a cluster

You can establish secure impersonation depending on the options you select when creating or editing a named connection as you connect to a cluster with the PDI client. If your environment requires advanced settings, your server is on Windows, or you are using a Cloudera Impala database, you should consider applying optional manual and advanced configurations for secure impersonation on the Pentaho Server.

The following sections guide you through the optional manual setup and advanced configurations:

* Prerequisites
* Manually configuring secure impersonation parameters
* Configuring MapReduce jobs (Windows-only)
* Connecting to a Cloudera Impala database (Cloudera-only)
* Next steps

For an overview of secure impersonation, refer to [Setting Up Big Data Security](/pdia-admin/9.3-administer/secure-the-pentaho-system/big-data-security.md).

### Prerequisites

The following requirements must be met to use secure impersonation:

* The cluster must be secured with Kerberos, and the Kerberos server used by the cluster must be accessible to the Pentaho Server.
* The Pentaho computer must have Kerberos installed and configured. See [Set Up Kerberos for Pentaho](/pdia-admin/9.3-administer/secure-the-pentaho-system/big-data-security/how-to-enable-kerberos-authentication/set-up-kerberos-for-pentaho.md) for instructions.

**Note:** If your system has version 8 of the Java Runtime Environment (JRE) or the Java Development Kit (JDK) installed, you do not need to install the Kerberos client, since it is included in the Java installation. You still need to modify the Kerberos configuration file, `krb5.conf`, as specified in the [Set Up Kerberos for Pentaho](/pdia-admin/9.3-administer/secure-the-pentaho-system/big-data-security/how-to-enable-kerberos-authentication/set-up-kerberos-for-pentaho.md) article.

* A Pentaho driver for your Hadoop cluster must be installed and a named connection in the PDI client created. See the **Pentaho Data Integration** document for instructions.

### Manually configuring secure impersonation parameters

If you prefer an automated setup and authentication configuration of secure impersonation, you can use the security options while creating a named connection in the PDI client. See the **Install Pentaho Data Integration and Analytics** document for instructions. This section explains how to manually configure secure impersonation if you are not using the PDI client or need more advanced configurations.

The mapping type value in the `config.properties` file turns secure impersonation on or off. The mapping types supported by the Pentaho Server are **disabled** and **simple**. When set to **disabled** or left blank, the Pentaho Server does not use authentication. When set to **simple**, Pentaho users can connect to the Hadoop cluster as a proxy user.

**Note:** If you are using these instructions to manually configure secure impersonation by editing the `config.properties` file, you do not need to follow the steps in **Edit config.properties (Secured Clusters)** of the Hadoop distribution instructions listed in **Install Pentaho Data Integration and Analytics**.

Perform the following steps to manually set up secure impersonation for your Hadoop cluster and PDI:

1. Stop the Pentaho Server.
2. Navigate to the `<username>/.pentaho/metastore/pentaho/NamedCluster/Configs/<user-defined connection name>` directory and open the `config.properties` file with a text editor.

   **Note:** This filepath and the `config.properties` file are created when you set up your named connection. See the **Pentaho Data Integration** document for instructions on creating a named connection.
3. Modify the `config.properties` file with the values in the following table:

   | Parameter                                                                             | Value                                                                                                                  |
   | ------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- |
   | **pentaho.authentication.default.kerberos.principal**                                 | `exampleUser@EXAMPLE.COM`                                                                                              |
   | **pentaho.authentication.default.kerberos.keytabLocation**                            | Set the Kerberos keytab. You only need to set the password or the keytab, not both.                                    |
   | **pentaho.authentication.default.kerberos.password**                                  | Set the Kerberos password. You only need to set the password or the keytab, not both.                                  |
   | **pentaho.authentication.default.mapping.impersonation.type**                         | simple                                                                                                                 |
   | **pentaho.authentication.default.mapping.server.credentials.kerberos.principal**      | `exampleUser@EXAMPLE.COM`                                                                                              |
   | **pentaho.authentication.default.mapping.server.credentials.kerberos.keytabLocation** | You only need to set the password or the keytab, not both.                                                             |
   | **pentaho.authentication.default.mapping.server.credentials.kerberos.password**       | You only need to set the password or the keytab, not both.                                                             |
   | **pentaho.oozie.proxy.user**                                                          | Add the proxy user's name if you plan to access the Oozie service through a proxy. Otherwise, leave it set to `oozie`. |

   In this table, `exampleUser@EXAMPLE.COM` is provided as a sample of how you would specify your proxy user. If you have key-value pairs in your existing `config.properties` file that are not security related, merge those settings into the file.
4. Save and close the `config.properties` file.
5. Restart the Pentaho Server.
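
Before restarting the server, it can help to check the edited file against the table above. The following is a minimal sketch, assuming a plain `key=value` layout. It is not a Pentaho tool, the function names are hypothetical, and `/path/to/exampleUser.keytab` is a placeholder. The parser is deliberately simplified and does not handle the full Java properties syntax (escapes, `:` separators, line continuations).

```python
KERBEROS_PREFIXES = (
    "pentaho.authentication.default.kerberos",
    "pentaho.authentication.default.mapping.server.credentials.kerberos",
)
TYPE_KEY = "pentaho.authentication.default.mapping.impersonation.type"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_impersonation_config(props):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if props.get(TYPE_KEY, "") != "simple":
        problems.append(f"{TYPE_KEY} must be 'simple'")
    for prefix in KERBEROS_PREFIXES:
        if not props.get(f"{prefix}.principal"):
            problems.append(f"{prefix}.principal is not set")
        # Either the keytab or the password is enough.
        if not (props.get(f"{prefix}.keytabLocation") or props.get(f"{prefix}.password")):
            problems.append(f"{prefix}: set keytabLocation or password")
    return problems


# Sample content; the principal and keytab path are placeholders.
SAMPLE = """\
pentaho.authentication.default.kerberos.principal=exampleUser@EXAMPLE.COM
pentaho.authentication.default.kerberos.keytabLocation=/path/to/exampleUser.keytab
pentaho.authentication.default.kerberos.password=
pentaho.authentication.default.mapping.impersonation.type=simple
pentaho.authentication.default.mapping.server.credentials.kerberos.principal=exampleUser@EXAMPLE.COM
pentaho.authentication.default.mapping.server.credentials.kerberos.keytabLocation=/path/to/exampleUser.keytab
pentaho.authentication.default.mapping.server.credentials.kerberos.password=
pentaho.oozie.proxy.user=oozie
"""

print(check_impersonation_config(parse_properties(SAMPLE)))  # prints []
```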

### Configuring MapReduce jobs

If you are trying to establish secure impersonation on a Windows system, you must modify the `mapred-site.xml` file to run MapReduce jobs for secure impersonation.

Perform the following steps to modify the `mapred-site.xml` file for secure impersonation:

1. Navigate to the `<username>/.pentaho/metastore/pentaho/NamedCluster/Configs/<user-defined connection name>` directory and open the `mapred-site.xml` file with a text editor.
2. Add the following two properties to the `mapred-site.xml` file:

   ```xml
   <property>
     <name>mapreduce.app-submission.cross-platform</name>
     <value>true</value>
   </property>
   <property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
   </property>
   ```
3. Save and close the file.
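
To confirm the edit, you could check that both properties are present with the expected values. The sketch below uses Python's standard `xml.etree.ElementTree` module. The function name `missing_properties` is hypothetical, and the sample assumes the usual Hadoop layout of `<property>` elements under a `<configuration>` root.

```python
import xml.etree.ElementTree as ET

# The two properties this section asks you to add.
REQUIRED = {
    "mapreduce.app-submission.cross-platform": "true",
    "mapreduce.framework.name": "yarn",
}


def missing_properties(xml_text):
    """Return the required properties that are absent or have a different value."""
    root = ET.fromstring(xml_text)
    found = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    return {name: value for name, value in REQUIRED.items() if found.get(name) != value}


# Sample file content that is missing one of the two properties.
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

print(missing_properties(SAMPLE))
# prints {'mapreduce.app-submission.cross-platform': 'true'}
```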

### Connecting to a Cloudera Impala database

If you are trying to establish secure impersonation with a Cloudera Hadoop cluster and you are connecting to a secure Cloudera Impala database, you must update security-specific settings on the PDI database connection.

Perform the following steps to update your connection to the secure Cloudera Impala database:

1. Download the Cloudera Impala JDBC driver for your operating system from the Cloudera website: <https://www.cloudera.com/downloads/connectors/impala/jdbc/2-6-15.html>

   **Note:** Secure impersonation with Impala is only supported with the Cloudera Impala JDBC driver. You may have to create an account with Cloudera to download the driver file.
2. Extract the `ImpalaJDBC41.jar` file from the downloaded zip file into the folder `<username>/.pentaho/metastore/pentaho/NamedCluster/Configs/cdh61/lib`. The `ImpalaJDBC41.jar` file is the only file to extract from the downloaded file.
3. Connect to a secure CDH cluster.

   If you have not set up a secure cluster, complete the procedure for setting up Pentaho to connect to a secure cluster found in the **Install Pentaho Data Integration and Analytics** document.
4. Start the PDI client and choose **File** > **New** > **Transformation** to add a new transformation.

   See the **Pentaho Data Integration** document for instructions on starting the PDI client.
5. Click the **View** tab, then right-click **Database Connections** and choose **New**.
6. In the **Database Connection** dialog box, enter the values from the following table:

   | Field               | Value               |
   | ------------------- | ------------------- |
   | **Connection Name** | `User-defined name` |
   | **Connection Type** | Cloudera Impala     |
   | **Host Name**       | `Hostname`          |
   | **Database Name**   | default             |
   | **Port Number**     | 21050               |
7. Click **Options** in the left pane of the **Database Connection** dialog box and enter the parameter values as shown in the following table:

   | Parameter          | Value                                              |
   | ------------------ | -------------------------------------------------- |
   | **KrbHostFQDN**    | The fully qualified domain name of the Impala host |
   | **KrbServiceName** | The service principal name of the Impala server    |
   | **KrbRealm**       | The Kerberos realm used by the cluster             |
8. Click **Test** when your settings are entered.

A success message appears if everything was entered correctly.
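
For orientation, the sketch below shows how the connection values and the three option parameters could combine into a connection URL. It assumes the Cloudera driver's URL form of semicolon-separated `key=value` properties after the database name. PDI builds the actual URL itself and may add other properties, so treat this as illustrative only. The function name `impala_jdbc_url` and the host, service, and realm values are hypothetical.

```python
def impala_jdbc_url(host, port=21050, database="default", **options):
    """Build an Impala JDBC URL with semicolon-separated driver options."""
    suffix = "".join(f";{key}={value}" for key, value in options.items())
    return f"jdbc:impala://{host}:{port}/{database}{suffix}"


# Placeholder values; substitute those for your own cluster.
url = impala_jdbc_url(
    "impala.example.com",
    KrbHostFQDN="impala.example.com",
    KrbServiceName="impala",
    KrbRealm="EXAMPLE.COM",
)
print(url)
```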

### Next steps

After you save your changes in the repository and your Hadoop cluster is connected to the Pentaho Server, you are ready to use secure impersonation to run your transformations and jobs from the Pentaho Server.

**Note:** Secure impersonation from the PDI client is not currently supported.

See the **Install Pentaho Data Integration and Analytics** document for instructions on any further advanced configurations you may need to perform to connect your Hadoop cluster to the Pentaho Server.

