Bulk load into Azure SQL DB
Use the Bulk load into Azure SQL DB job entry to load CSV data from Azure Data Lake Storage into an Azure SQL Database.
For more information about Azure SQL, see the Azure SQL documentation.
Before you begin
To use this job entry, you need:
An Azure account.
A connection to the Azure SQL database where you want to load data.
A VFS connection to Azure Data Lake Storage.
A table and schema in the target database. The table must include all required columns.
You need the INSERT and ADMINISTER DATABASE BULK OPERATIONS permissions in the Azure SQL database.
General
The following field is available on the General tab:
Entry Name: Specify the unique name of the job entry on the canvas. You can customize the name or leave it as the default.
Options
This job entry includes four tabs. Each tab is described in the following sections.
Input tab

Use the options in this tab to define your input source:
Source
Specify the VFS URL of the data to import. The URL can point to a file or a folder in Azure Data Lake Storage. The URL must begin with pvfs:// and must specify the bucket and object you want to load. See Pentaho address to a VFS connection.
What file type is your source?
Select the file type of the input source. The source must be a CSV file.
Delimited text settings
Specify the following settings for a delimited text file:
Delimiter
Specify the character used to separate a data field. The default is semicolon (;).
First line is column names
Select Yes if the first line in the file contains the column names.
Output tab

Use the options on the Output tab to specify the destination:
Database connection
Select your database connection from a list of existing Azure SQL DB connections.
If you do not have an existing connection, click New. If you need to modify an existing connection, click Edit.
Schema
Select the schema to use for the bulk load. The job entry reads the schemas that exist in the database to populate this list.
Table name
Select the name of the table to bulk load. The job entry reads the table names from the selected schema to populate this list.
Columns
Preview the column names and data types in the selected table.
You can use variables in the Schema and Table name fields.
Options tab

Use the options on the Options tab to define how data is loaded into the destination table:
Truncate table
Select to remove all data in the table before loading the current data. Clear to append data during a new data load (default).
Advanced options tab

Use this tab to configure additional parameters for the bulk load.
Any Name/Value pair that you add is passed to the Azure SQL database as a parameter. You are responsible for validating parameter names and values.
The Force parameter is provided as an example.
Force
Specify whether to force loading files into a database:
True
Loads data to the table even if the data was already loaded from that file before. This option can duplicate data in a table.
False
Ignores staged data files already loaded into the table (default).
Last updated
Was this helpful?

