
# Use Knox to access Hortonworks

You can use Knox to provide secure access to the Hadoop components on a cluster. Apache Knox is a gateway that provides perimeter security for Hadoop services on the Hortonworks Data Platform (HDP). Connecting to a cluster through Knox gives you a single point of access to Hadoop services, so you do not need to map to each service separately. If your system administrator has implemented Apache Ranger on the cluster, Pentaho respects the policies that the administrator has set up.

Here is an example of a Knox deployment:

The PDI client connects to Knox using a user ID and password that are registered in LDAP. Knox then authenticates to the Kerberos Key Distribution Center (KDC) with the PDI client user ID and password. Lastly, Knox checks authorization with Ranger and submits the request to the Hadoop cluster.

![Knox environment](/files/hlCZBmpxYC0gkq9TAeqm)
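The first hop in this flow, the client presenting LDAP credentials to the gateway, can be sketched as follows. This is a minimal illustration and not how the PDI client is implemented: the host `knox.example.com`, the topology name `MyHDPCluster`, the WebHDFS path, and the use of HTTP Basic authentication are all assumptions. The request is only built, never sent.

```python
import base64
import urllib.request


def knox_request(gateway_url, topology, service_path, user, password):
    """Build (but do not send) an HTTPS request routed through a Knox gateway.

    Assumes the gateway exposes services under /gateway/<topology>/<service>
    and accepts HTTP Basic credentials; both are assumptions for this sketch.
    """
    url = f"{gateway_url.rstrip('/')}/gateway/{topology}/{service_path.lstrip('/')}"
    # Encode "user:password" for the Basic Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical values, for illustration only.
req = knox_request(
    "https://knox.example.com:8443",
    "MyHDPCluster",
    "webhdfs/v1/tmp?op=LISTSTATUS",
    "pdi_user",
    "secret",
)
print(req.full_url)
```

Every service is reached through the same gateway host and port, which is what makes Knox a single point of access.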

### Setup requirements for Knox with Pentaho

As a system or cluster administrator, you must obtain the following information and provide it to your Pentaho users:

* **Credentials**

  The cluster name, gateway URL, username, and password.
* **SSL certificate**

  The Knox gateway URL is a secure (HTTPS) URL, so an SSL certificate must be installed before you can perform operations through the gateway. See **Configure SSL (HTTPS) in the Pentaho User Console and Server** for information on SSL.
* **LDAP directory server**

  Knox authenticates users against an LDAP directory server, so you must be able to authenticate to an LDAP server. For more information, see [Switch to LDAP](https://github.com/pentaho/documentation/blob/main/PDIA/9.3/Administer/Secure%20the%20Pentaho%20system/Secure%20the%20Pentaho%20System/User%20security/Advanced%20security%20providers/LDAP%20security/Switch%20to%20LDAP=GUID-60A43274-459A-4593-ACC1-D887E2C4BA49=5=en=.md) and **LDAP Properties**.
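Before handing these details to users, it can help to sanity-check them. The sketch below is a hypothetical helper, not part of Pentaho or Knox: the class name `KnoxConnectionInfo` and its fields are invented for illustration. It only checks that each credential is present and that the gateway URL uses HTTPS.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class KnoxConnectionInfo:
    """Hypothetical container for the details an administrator provides."""

    cluster_name: str
    gateway_url: str
    username: str
    password: str

    def problems(self):
        """Return a list of human-readable issues; an empty list means none found."""
        issues = []
        for name in ("cluster_name", "gateway_url", "username", "password"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is missing")
        # The Knox URL is a secure URL, so anything other than https is flagged.
        if self.gateway_url.strip() and urlparse(self.gateway_url).scheme != "https":
            issues.append("gateway_url must use https")
        return issues


# Hypothetical values, for illustration only.
info = KnoxConnectionInfo(
    "MyHDPCluster", "https://knox.example.com:8443/gateway", "pdi_user", "secret"
)
print(info.problems())
```

This check does not verify that the SSL certificate is installed or that the LDAP server accepts the credentials; those still need to be confirmed against the real environment.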

### Hive configuration with Knox

You can configure your Hive database with Knox.

1. Open the connection to your Hive database, or review **Define data connections** in the **Install Pentaho Data Integration and Analytics** document for instructions on setting up a connection.
2. In the **Database Connection** dialog box, select **Options** in the page panel on the left to display the **Parameters** panel.
3. Enter the following parameters and values in the **Options** section and click **OK**.

   | Parameter         | Definition                 | Value                       |
   | ----------------- | -------------------------- | --------------------------- |
   | **httpPath**      | Path to database           | `gateway/MyHDPCluster/hive` |
   | **knox**          | Option to use Knox         | `true`                      |
   | **transportMode** | Connection protocol to use | `http`                      |
   | **ssl**           | Option to use SSL          | `true`                      |

   ![Database Connection dialog box](/files/xPRkY0ywt9AEkeBND2tJ)

You are now ready to use this connection for any Hive steps.
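For orientation, the parameters above correspond to session variables in a Hive JDBC URL that uses HTTP transport. The sketch below assembles such a URL under stated assumptions: the host `knox.example.com` and port `8443` are hypothetical, and the `knox` option is left out because it is set in the Pentaho dialog and this page does not describe how Pentaho passes it on. It shows the general URL shape only, not the exact string Pentaho generates.

```python
def hive_knox_jdbc_url(host, port, options, database=""):
    """Assemble a Hive JDBC URL with ';key=value' session variables.

    The host, port, and option set are supplied by the caller; nothing here
    contacts a server or validates the values.
    """
    session_vars = ";".join(f"{key}={value}" for key, value in options.items())
    return f"jdbc:hive2://{host}:{port}/{database};{session_vars}"


# Values from the parameter table; host and port are hypothetical.
options = {
    "ssl": "true",
    "transportMode": "http",
    "httpPath": "gateway/MyHDPCluster/hive",
}
url = hive_knox_jdbc_url("knox.example.com", 8443, options)
print(url)
# jdbc:hive2://knox.example.com:8443/;ssl=true;transportMode=http;httpPath=gateway/MyHDPCluster/hive
```

The `httpPath` value carries the Knox topology name (`MyHDPCluster` in this example), so it changes whenever the cluster name provided by your administrator changes.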

See the **Install Pentaho Data Integration and Analytics** document for further configuration information when using Hive with Spark on AEL.

