Create an S3 bucket for PDI-CLI

You must create an S3 bucket to deploy the PDI-CLI image on AWS.

  1. Create an S3 bucket.

    To create an S3 bucket, see Creating a bucket.

    To upload a file to S3, see Uploading objects.

  2. Record the newly created S3 bucket name in the Worksheet for AWS hyperscaler.

  3. Upload files into the S3 bucket.

    After the S3 bucket is created, use the AWS Management Console to create any needed directories, as shown in the following table, and upload the relevant files to the appropriate directory.

    The following table lists the relevant Pentaho directories and actions related to each directory.

Directory

Actions

/root

All the files in the root directory of the S3 bucket are copied to the /home/pentaho/data-integration/data directory.

If you must copy a file to the /home/pentaho/data-integration/data directory, drop the file in the root directory of the S3 bucket.

jdbc-drivers

If the Pentaho installation needs JDBC drivers, do the following:

  1. Add the jdbc-drivers directory to the S3 bucket.

  2. Place the drivers in this directory. Any files within this directory are copied to Pentaho’s lib directory.

plugins

If the Pentaho installation needs additional plugins installed, do the following:

  1. Add the plugins directory to the S3 bucket.

  2. Copy the plugins to the plugins directory. Any files within this directory are copied to Pentaho’s plugins directory. For this reason, the plugins should be organized in their own directories as expected by Pentaho.

metastore

Pentaho can execute jobs and transformations. Some of these require additional information that is usually stored in the Pentaho metastore.

If you must provide the Pentaho metastore to Pentaho, copy the local .pentaho directory to the metastore directory of the S3 bucket (you can use a different directory name by passing a variable). The content of the .pentaho directory is then copied to the /home/pentaho/.pentaho directory within the Docker image.
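The directory layout in the table above can be sketched as a small Python helper that computes the S3 object key for each file before upload. This is a minimal sketch, not part of the Pentaho tooling: the function names `s3_key` and `upload_all` and the category labels `data`, `jdbc`, `plugin`, and `metastore` are hypothetical. It also assumes that the contents of the local .pentaho directory are placed directly under the metastore directory. The optional upload step uses the third-party `boto3` library and requires AWS credentials.

```python
from pathlib import PurePosixPath


def s3_key(category: str, relative_path: str, metastore_dir: str = "metastore") -> str:
    """Return the S3 object key for a file, following the directory table.

    category is a hypothetical label: "data", "jdbc", "plugin", or "metastore".
    relative_path is the file path relative to its local source directory.
    """
    prefixes = {
        "data": "",                  # bucket root -> /home/pentaho/data-integration/data
        "jdbc": "jdbc-drivers",      # -> Pentaho's lib directory
        "plugin": "plugins",         # -> Pentaho's plugins directory
        "metastore": metastore_dir,  # -> /home/pentaho/.pentaho
    }
    if category not in prefixes:
        raise ValueError(f"unknown category: {category!r}")
    rel = PurePosixPath(relative_path)
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"expected a relative path without '..': {relative_path!r}")
    return str(PurePosixPath(prefixes[category]) / rel)


def upload_all(bucket: str, files: list[tuple[str, str, str]]) -> None:
    """Upload (local_path, category, relative_path) entries to the bucket."""
    import boto3  # third-party; imported lazily so s3_key works without it

    s3 = boto3.client("s3")
    for local_path, category, relative_path in files:
        s3.upload_file(local_path, bucket, s3_key(category, relative_path))
```

For example, `s3_key("plugin", "my-plugin/plugin.xml")` keeps the plugin inside its own directory under plugins, which matches the requirement that plugins stay organized as Pentaho expects.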

The following table lists the relevant Pentaho files and the actions related to each file.

File

Actions

content-config.properties

The content-config.properties file is used by the Pentaho Docker image to provide instructions on which S3 files to copy over and where to place them.

The instructions are populated as multiple lines in the following format:

${KETTLE_HOME_DIR}/<some-dir-or-file>=${APP_DIR}/<some-dir>

A template for this file can be found in the templates project directory.

The template has an entry where the file context.xml is copied to the required location within the Docker image:

${KETTLE_HOME_DIR}/context.xml=${APP_DIR}/context.xml

content-config.sh

A bash script that can be used to configure files, change file and directory ownership, move files around, install missing apps, and so on.

You can add it to the S3 bucket.

The script is executed in the Docker image after the other files are processed.
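To check a content-config.properties file before uploading it, the line format described above can be parsed with a short Python sketch. This only illustrates the `<left>=<right>` format and is not how the Docker image itself processes the file. The function name `parse_content_config` and the values given for `KETTLE_HOME_DIR` and `APP_DIR` are hypothetical, and skipping lines that start with `#` is an assumption based on the usual properties-file convention.

```python
from string import Template


def parse_content_config(text: str, variables: dict[str, str]) -> list[tuple[str, str]]:
    """Parse LEFT=RIGHT lines and expand ${VAR} placeholders on both sides.

    Raises ValueError for a line without '=' and KeyError for an
    undefined variable, so mistakes surface before the file is uploaded.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' marks a comment
            continue
        left, sep, right = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in line: {raw!r}")
        entries.append((
            Template(left.strip()).substitute(variables),
            Template(right.strip()).substitute(variables),
        ))
    return entries


# Hypothetical values; the real ones are defined by the Docker image.
example_vars = {"KETTLE_HOME_DIR": "/example/kettle-home", "APP_DIR": "/example/app"}
example_text = "${KETTLE_HOME_DIR}/context.xml=${APP_DIR}/context.xml\n"
entries = parse_content_config(example_text, example_vars)
```

Running the parser on the template entry for context.xml yields one pair of expanded paths, which makes typos in variable names or a missing `=` easy to spot.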
