# Sqoop Import

You can use the **Sqoop Import** job entry to import data from a relational database into the Hadoop Distributed File System (HDFS) by using Apache Sqoop.

{% hint style="warning" %}
**Important:** When using Sqoop, the Java version on the client must match the major and minor version of the JDK used by the cluster that runs the job. If the versions do not match, an error occurs.
{% endhint %}
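
As a rough illustration of the version rule above, the following sketch compares the major and minor components of two Java version strings. The helper names `major_minor` and `versions_compatible` are hypothetical and not part of Pentaho or Sqoop, and the sample version strings are made up. The sketch assumes you have already obtained both strings, for example from the output of `java -version` on the client and on a cluster node.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a Java version string such as '1.8.0_292' or '11.0.12'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Java version string: {version!r}")
    major = int(match.group(1))
    # A bare version such as '17' has no minor component; treat it as 0.
    minor = int(match.group(2) or 0)
    return major, minor


def versions_compatible(client_version: str, cluster_version: str) -> bool:
    """True when the client and cluster share the same major and minor version."""
    return major_minor(client_version) == major_minor(cluster_version)


# Hypothetical version strings for illustration only.
print(versions_compatible("1.8.0_292", "1.8.0_181"))  # same 1.8 line
print(versions_compatible("1.8.0_292", "11.0.12"))    # 1.8 versus 11.0
```

Only the first two numeric components are compared, so differing update or patch levels do not affect the result in this sketch.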

In this job entry, you can create, edit, or select a Hadoop cluster configuration. You can reuse cluster configuration settings in other transformation steps and job entries that support Hadoop.

This job entry has two setup modes:

* **Quick Setup**: Provides the minimum options needed to run a Sqoop import (default).
* **Advanced Options**: Provides additional options to manage your import, including a command-line view that you can use to reuse an existing Sqoop command.

For more information about Apache Sqoop, see <http://sqoop.apache.org/>.

### General

* **Name**: Specify the unique name of the job entry on the canvas. You can customize the name or leave it as the default.
* **Advanced Options**: Select **Advanced Options** to switch to **Advanced Options** mode. In Advanced Options mode, select **Quick Setup** to return to Quick Setup mode.

### Quick Setup mode

![Sqoop Import job entry Quick Setup mode](/files/2taq62fSHYS7rxuQv61e)

#### Source

The source refers to the database that contains the data you want to import.

| Option                  | Definition                                                                                                                                                                                                                               |
| ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Database Connection** | <p>Select <strong>Choose Available</strong> to select an existing database connection.</p><p>If you do not have an existing connection, select <strong>New</strong>. To modify an existing connection, select <strong>Edit</strong>.</p> |
| **Edit**                | Edit the selected database connection.                                                                                                                                                                                                   |
| **New**                 | Create a new database connection. For more information, see [Define data connections](https://help.hitachivantara.com/Documentation/Pentaho/9.5/Setup/Define_data_connections).                                                          |
| **Table**               | The source table name. If your database requires a schema, use `SCHEMA.TABLE_NAME`. The table must exist in the source database.                                                                                                         |
| **Browse**              | Browse configured database connections by using the Database Explorer.                                                                                                                                                                   |

#### Target

The target refers to the Hadoop cluster and HDFS directory where you want to write the imported data.

| Option               | Definition                                                                                                                                                                                                                                            |
| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Hadoop Cluster**   | <p>Select an existing Hadoop cluster configuration or create a new one.</p><p>For more information about Hadoop, see <a href="https://help.hitachivantara.com/Documentation/Pentaho/9.5/Work_with_data/Use_Hadoop_with_Pentaho">Use Hadoop with Pentaho</a>.</p> |
| **Target Directory** | The HDFS directory where you want to write the imported data.                                                                                                                                                                                         |
| **Browse**           | <p>Browse the cluster file system and select a target directory.</p><p><strong>Note:</strong> Browse works only when you have a valid cluster connection configured.</p>                                                                              |

<details>

<summary>Open File dialog box</summary>

When you have a valid cluster connection, select **Browse** to open the **Open File** dialog box.

| Option                        | Definition                                                                                             |
| ----------------------------- | ------------------------------------------------------------------------------------------------------ |
| **Open from Folder**          | The path and name of the HDFS directory you are browsing. This directory becomes the active directory. |
| **Up One Level**              | Display the parent directory of the active directory.                                                  |
| **Delete**                    | Delete a folder from the active directory.                                                             |
| **Create Folder**             | Create a new folder in the active directory.                                                           |
| **Active Directory Contents** | Display the contents of the active directory.                                                          |
| **Filter**                    | Filter the items displayed in the active directory contents.                                           |

</details>

### Advanced Options mode

The **Advanced Options** mode displays **List View** by default.

{% hint style="info" %}
If you configured values in **Quick Setup** mode, those values appear in **List View**.
{% endhint %}

![Sqoop Import job entry Advanced Options mode](/files/ULGGOhXYh0kja79wwwu6)

| Option                | Definition                                                                                                                                                                                                   |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **List View**         | <p>View and edit settings as <strong>Argument</strong>/<strong>Value</strong> pairs on the <strong>Default</strong> tab.</p><p>Use the <strong>Custom</strong> tab to add your own argument/value pairs.</p> |
| **Command Line View** | Enter command-line arguments. A typical use case is pasting an existing Sqoop command line into this field.                                                                                                  |
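
To make the relationship between argument/value pairs and a command line concrete, here is a minimal sketch that joins a set of pairs into a single argument string. The argument names (`--connect`, `--username`, `--table`, `--target-dir`, `--num-mappers`, `--append`) are standard Sqoop import arguments, but the helper `build_sqoop_arguments`, the JDBC URL, the user name, the table, and the HDFS path are all hypothetical. This page does not state whether **Command Line View** also expects the leading `sqoop import`, so the sketch produces only the arguments.

```python
import shlex
from typing import Dict, Optional


def build_sqoop_arguments(arguments: Dict[str, Optional[object]]) -> str:
    """Join argument/value pairs into one shell-quoted argument string.

    A value of None marks a flag that takes no value.
    """
    parts = []
    for name, value in arguments.items():
        parts.append(f"--{name}")
        if value is not None:
            parts.append(str(value))
    # shlex.join quotes any value that contains spaces or shell metacharacters.
    return shlex.join(parts)


# Hypothetical connection details for illustration only.
arguments = {
    "connect": "jdbc:mysql://db.example.com:3306/sales",
    "username": "etl_user",
    "table": "ORDERS",
    "target-dir": "/user/pentaho/orders",
    "num-mappers": 4,
}
print(build_sqoop_arguments(arguments))
```

Building the string with `shlex.join` rather than plain concatenation keeps values with spaces intact as single arguments, which matters if the resulting text is later parsed as a command line.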

