# Connecting to a Hadoop cluster with the PDI client

To connect to a Hadoop cluster, you must access a driver, create a named connection, and then configure and test the connection. A named connection is a set of connection details, including the IP address and port number of the Hadoop cluster, stored under a name you assign so that you can reuse it later. You can create named connections for any supported vendor cluster and vendor version.

After you set up a named connection, you can edit or duplicate it. For example, to use a configuration with different security credentials, duplicate the connection, then edit the security settings on the copy. Named connections are especially useful when you move jobs and transformations from a development server to a production server: you only need to update the connection information stored under your cluster name in the Hadoop Clusters dialog box, and the jobs and transformations then use the new information from the named connection.
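The indirection described above can be illustrated with a small conceptual sketch. This is not PDI code and does not reflect how the PDI client stores connections. The `NamedConnection` class, the `connections` registry, the helper functions, and all host names, ports, and user names are hypothetical and exist only to show why a job that refers to a connection by name needs no changes when the connection details change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NamedConnection:
    """Hypothetical stand-in for a named connection (not a PDI class)."""
    name: str
    host: str
    port: int
    username: str = ""


# Hypothetical registry: connection details are looked up by name.
connections = {}


def save(conn):
    """Store (or overwrite) the details kept under conn.name."""
    connections[conn.name] = conn


def duplicate(source_name, new_name, **changes):
    """Copy an existing connection under a new name, then apply edits."""
    copy = replace(connections[source_name], name=new_name, **changes)
    save(copy)
    return copy


def run_job(connection_name):
    """A 'job' only knows the connection name; details resolve at run time."""
    conn = connections[connection_name]
    return f"{conn.host}:{conn.port}"


# Development: define the connection once (all values are made up).
save(NamedConnection("my-cluster", "10.0.0.5", 8020, "dev_user"))

# Duplicate it, then edit only the security-related field on the copy.
duplicate("my-cluster", "my-cluster-secure", username="secure_user")

dev_target = run_job("my-cluster")

# Production: update the details stored under the same name.
save(NamedConnection("my-cluster", "10.1.0.9", 8020, "prod_user"))

# The same job, unchanged, now resolves to the production cluster.
prod_target = run_job("my-cluster")
```

In this sketch the job definition never changes; only the entry stored under `my-cluster` does, which mirrors the role of the cluster name in the Hadoop Clusters dialog box.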

