Sqoop Export
The Sqoop Export job entry uses Apache Sqoop to export data from Hadoop into a relational database (RDBMS).
From this job entry, you can create, edit, and select a Hadoop cluster configuration. Cluster configuration settings can be reused in other transformation steps and job entries that support Hadoop.
This job entry has two setup modes:
Quick Setup: Provides the minimum options needed to run a Sqoop export (default).
Advanced Options: Provides additional options to manage your export, including a command-line view that you can use to reuse an existing Sqoop command.
For more information about Apache Sqoop, see http://sqoop.apache.org/.
General
Name: Specify the unique name of the job entry on the canvas. You can customize the name or leave it as the default.
Advanced Options: Select Advanced Options to switch to Advanced Options mode. In Advanced Options mode, select Quick Setup to return to Quick Setup mode.
Quick Setup mode

Source
The source refers to the Hadoop cluster where the data you want to export is stored.
Hadoop Cluster
The Hadoop cluster that contains the data for export.
Select Choose Available to pick an existing cluster configuration. If you do not have an existing cluster configuration, select New.
For Hadoop information, see Use Hadoop with Pentaho.
Edit
Edit the selected Hadoop cluster configuration.
New
Create a new Hadoop cluster configuration.
Export Directory
The HDFS directory that contains the data you want to export.
Browse
Browse the cluster file system and select the directory that contains your export data.
Note: Browse works only when you have a valid cluster connection configured and selected.
Open File dialog box
When you have a valid cluster connection, select Browse to open the Open File dialog box.
Open from Folder
The path and name of the HDFS directory you are browsing. This directory becomes the active directory.
Up One Level
Display the parent directory of the active directory.
Delete
Delete a folder from the active directory.
Create Folder
Create a new folder in the active directory.
Active Directory Contents
Display the contents of the active directory.
Filter
Filter the items displayed in the active directory contents.
Target
The target refers to the database where you want to export the data.
Database Connection
Select Choose Available to select an existing database connection.
If you do not have an existing connection, select New. To modify an existing connection, select Edit.
Edit
Edit the selected database connection.
New
Create a new database connection. For more information, see Define data connections.
Table
The destination table name. If your database requires a schema, use SCHEMA.TABLE_NAME. The table must already exist and its structure must match the source data format.
Browse
Open the Database Explorer to browse the selected database connection and choose the destination table.
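The Quick Setup fields above roughly correspond to the core arguments of a Sqoop export. The following is a minimal sketch, assuming the database connection resolves to a JDBC URL and that Export Directory and Table map to the standard Sqoop `--export-dir` and `--table` arguments. The function name, host, database, table, directory, and user name are hypothetical values chosen for illustration, not values taken from this job entry.

```python
import shlex


def build_sqoop_export_command(jdbc_url, table, export_dir, username=None):
    """Assemble a Sqoop export command from Quick Setup-style values.

    Hypothetical helper: it only illustrates how the Source and Target
    fields might translate into standard Sqoop export arguments.
    """
    args = [
        "sqoop", "export",
        "--connect", jdbc_url,       # Target: Database Connection
        "--table", table,            # Target: Table (must already exist)
        "--export-dir", export_dir,  # Source: Export Directory in HDFS
    ]
    if username:
        args += ["--username", username]
    return args


# Example values are assumptions for illustration only.
command = build_sqoop_export_command(
    jdbc_url="jdbc:mysql://db.example.com:3306/sales",
    table="ORDERS",
    export_dir="/user/pentaho/orders",
    username="etl_user",
)
print(shlex.join(command))
```

Building the command as a list and quoting it with `shlex.join` keeps values that contain spaces intact when the command is printed or pasted elsewhere.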
Advanced Options mode
Advanced Options mode displays List View by default.
Any values you configured in Quick Setup mode appear in List View.

List View
View and edit settings as Argument/Value pairs on the Default tab.
Use the Custom tab to add your own argument/value pairs.
You can also configure advanced exports such as exports from Hive or HBase.
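To make the argument/value model concrete, the sketch below turns a set of pairs into Sqoop command-line arguments. It assumes that argument names are written without the leading `--`, that empty values mean the argument is unset, and that a value of `True` marks a flag with no value. Those conventions and the helper name `pairs_to_args` are assumptions made for this example; the argument names themselves (`connect`, `table`, `export-dir`, `num-mappers`, `batch`, `update-key`) are standard Sqoop export arguments.

```python
def pairs_to_args(pairs):
    """Convert argument/value pairs into a list of Sqoop arguments.

    Hypothetical helper: unset values (None or "") are skipped, and a
    value of True is emitted as a bare flag with no value.
    """
    args = []
    for name, value in pairs.items():
        if value is None or value == "":
            continue  # argument left blank, so it is not passed to Sqoop
        args.append(f"--{name}")
        if value is not True:
            args.append(str(value))
    return args


# Example pairs are assumptions for illustration only.
pairs = {
    "connect": "jdbc:mysql://db.example.com:3306/sales",
    "table": "ORDERS",
    "export-dir": "/user/pentaho/orders",
    "num-mappers": 4,
    "batch": True,      # flag without a value
    "update-key": "",   # left blank, so omitted
}
args = pairs_to_args(pairs)
print(args)
```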
Command Line View
Enter command-line arguments. A typical use case is pasting an existing Sqoop command line into this field.
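Before pasting an existing command, it can help to see which argument/value pairs it contains. The following is a rough sketch that splits a Sqoop command with Python's `shlex` module and collects its long (`--name`) arguments. The function name and the sample command are hypothetical, and the parser is deliberately simple: it ignores short options such as `-m` and treats any token that follows a long argument and does not start with `-` as that argument's value.

```python
import shlex


def parse_sqoop_command(command_line):
    """Split a Sqoop command line into a dict of long arguments.

    Hypothetical helper for inspection only. Flags without a value are
    stored as True; short options and stray tokens are skipped.
    """
    tokens = shlex.split(command_line)
    # Drop a leading "sqoop export" if the whole command was supplied.
    if tokens[:2] == ["sqoop", "export"]:
        tokens = tokens[2:]

    pairs = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            name = token[2:]
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
            if has_value:
                pairs[name] = tokens[i + 1]
                i += 2
            else:
                pairs[name] = True
                i += 1
        else:
            i += 1  # short option or stray token: not handled in this sketch
    return pairs


# The sample command is an assumption for illustration only.
existing = (
    "sqoop export --connect jdbc:mysql://db.example.com/sales "
    "--table ORDERS --export-dir '/user/pentaho/order data' --batch"
)
parsed = parse_sqoop_command(existing)
for name, value in parsed.items():
    print(name, "=", value)
```

Using `shlex.split` rather than a plain whitespace split preserves quoted values, such as a directory path that contains a space.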

