
# Create an S3 bucket

Create an S3 bucket only if you want to take one or more of the following actions. Otherwise, proceed to create the EKS cluster.

* Add third-party JAR files, such as JDBC drivers or custom JAR files, for Pentaho to use.
* Customize the default Pentaho configuration.
* Replace the server files.
* Upload or update the metastore.
* Add files to the Platform and PDI Server's `/home/pentaho/.kettle` directory. This directory is mapped to the `KETTLE_HOME_DIR` environment variable, which is used by the `content-config.properties` file.

1. Create an S3 bucket.

   To create an S3 bucket, see [Creating a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html).

   To upload a file to S3, see [Uploading objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html).
2. Record the newly created S3 bucket name in the [Worksheet for AWS hyperscaler](/install/10.2-install/pentaho-installation-overview-cp/hyperscalers-landing-page/installing-pentaho-on-aws/running-pdi-cli-on-aws/worksheet-for-aws-hyperscaler-common.md).
3. Upload files into the S3 bucket.

   After the S3 bucket is created, manually create any needed directories as shown in the following table and upload the relevant files to an appropriate directory location by using the AWS Management Console.

   The following table lists the relevant Pentaho directories and the actions related to each directory.

| Directory                  | Actions                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| /root                      | <p>All the files in the S3 bucket are copied to the Platform and PDI Server's <code>/home/pentaho/.kettle</code> directory.</p><p>If you must copy a file to the <code>/home/pentaho/.kettle</code> directory, drop the file in the <code>root</code> directory of the S3 bucket.</p>                                                                                                                                                                                                                                                                  |
| custom-lib                 | <p>If Pentaho needs custom JAR libraries, add the <code>custom-lib</code> directory to the S3 bucket and place the libraries there.</p><p>Any files within this directory will be copied to Pentaho’s <code>lib</code> directory.</p> |
| jdbc-drivers               | <p>If the Pentaho installation needs JDBC drivers, do the following:</p><ol><li>Add the <code>jdbc-drivers</code> directory to the S3 bucket.</li><li>Place the drivers in this directory.<br>Any files within this directory will be copied to Pentaho’s <code>lib</code> directory.</li></ol> |
| plugins                    | <p>If the Pentaho installation needs additional plugins installed, do the following:</p><ol><li>Add the <code>plugins</code> directory to the S3 bucket.</li><li>Copy the plugins to the <code>plugins</code> directory.<br>Any files within this directory are copied to Pentaho’s <code>plugins</code> directory. For this reason, the plugins should be organized in their own directories as expected by Pentaho.</li></ol>                                                                                                                        |
| drivers                    | <p>If the Pentaho installation needs big data drivers installed, do the following:</p><ol><li>Add the <code>drivers</code> directory to the S3 bucket.</li><li>Place the big data drivers in this directory.<br>Any files placed within this directory will be copied to Pentaho’s <code>drivers</code> directory.</li></ol>                                                                                                                                                                                                                           |
| metastore                  | <p>Pentaho can execute jobs and transformations. Some of these require additional information that is usually stored in the Pentaho metastore.</p><p>If you must provide the Pentaho metastore to Pentaho, copy the local <code>metastore</code> directory to the root of the S3 bucket. From there, the <code>metastore</code> directory is copied to the proper location within the Docker image.</p> |
| server-structured-override | <p>The <code>server-structured-override</code> directory is the last resort if you want to make changes to any other files in the image at runtime.</p><p>For example, you can use it for configuring authentication and authorization.</p><p>Any files and directories within this directory will be copied to the <code>pentaho-server</code> directory the same way they appear in the <code>server-structured-override</code> directory.</p><p>If the same files exist in the <code>pentaho-server</code> directory, they will be overwritten.</p> |

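Before uploading, it can help to stage the layout locally so that the directory names match what the table above expects. The following Python sketch creates a local folder that mirrors those directories. The `stage_bucket_layout` function and the `pentaho-s3-staging` folder name are hypothetical and are not part of Pentaho or AWS tooling. The sketch does not contact AWS; you would still upload the staged content with the AWS Management Console.

```python
from pathlib import Path

# Directory names taken from the table above. Files placed in the bucket
# root go to /home/pentaho/.kettle, so the root needs no subdirectory.
BUCKET_DIRECTORIES = [
    "custom-lib",
    "jdbc-drivers",
    "plugins",
    "drivers",
    "metastore",
    "server-structured-override",
]


def stage_bucket_layout(staging_root, needed=BUCKET_DIRECTORIES):
    """Create only the needed directories under a local staging folder."""
    root = Path(staging_root)
    created = []
    for name in needed:
        if name not in BUCKET_DIRECTORIES:
            raise ValueError(f"Unknown bucket directory: {name}")
        target = root / name
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Hypothetical local folder name; choose any path you like.
    for path in stage_bucket_layout("pentaho-s3-staging", ["jdbc-drivers", "plugins"]):
        print(path)
```

Rejecting unknown names is a deliberate choice in this sketch: a misspelled directory would otherwise be uploaded and never mapped to a Pentaho directory.
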
The following table lists the relevant Pentaho files and the actions related to each file.

| File                      | Actions                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| context.xml               | <p>The Pentaho configuration YAML is included with the image in the <code>templates</code> project directory and is used to install this product. You must set the RDS host and RDS port parameters when you install Pentaho. Upon installation, the parameters in the configuration YAML are used to generate a custom <code>context.xml</code> file for the Pentaho installation so it can connect to the database-specific repository.</p><p>If these are the only changes required in the <code>context.xml</code>, you don’t need to provide a <code>context.xml</code> in the S3 bucket. On the other hand, if you must configure additional parameters in the <code>context.xml</code>, you must provide a custom <code>context.xml</code> file in the S3 bucket.</p><p>In the <code>context.xml</code> template, replace the \<RDS\_HOST\_NAME> and \<RDS\_PORT> entries with the values you recorded on the <a href="/pages/oq9Dq3OGxsn7u2V4lnVQ">Worksheet for AWS hyperscaler</a>.</p> |
| content-config.properties | <p>The <code>content-config.properties</code> file is used by the Pentaho Docker image to provide instructions on which S3 files to copy over and where to place them.</p><p>The instructions are populated as multiple lines in the following format:</p><p><code>${KETTLE\_HOME\_DIR}/\<some-dir-or-file>=${SERVER\_DIR}/\<some-dir></code></p><p>A template for this file can be found in the <code>templates</code> project directory.</p><p>The template has an entry where the file <code>context.xml</code> is copied to the required location within the Docker image:</p><p><code>${KETTLE\_HOME\_DIR}/context.xml=${SERVER\_DIR}/tomcat/webapps/pentaho/META-INF/context.xml</code></p> |
| content-config.sh         | <p>A bash script that can be used to configure files, change file and directory ownership, move files around, install missing apps, and so on.</p><p>You can add the script to the S3 bucket.</p><p>The script is executed in the Docker image after the other files are processed.</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| metastore.zip             | <p>Pentaho can execute jobs and transformations. Some of these require additional information that is usually stored in the Pentaho metastore.</p><p>If you must provide the Pentaho metastore to Pentaho, zip the content of the <code>local.pentaho</code> directory with the name <code>metastore.zip</code> and add it to the root of the S3 bucket. The <code>metastore.zip</code> file is extracted to the proper location within the Docker image.</p><p><strong>Note:</strong> VFS connections cannot be copied from PDI to the hyperscaler server the same way as named connections. You must connect to Pentaho on the hyperscaler and create the new VFS connection there.</p> |

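The mapping format used by `content-config.properties` can be illustrated with a short Python sketch. It parses lines of the form `${KETTLE_HOME_DIR}/<source>=${SERVER_DIR}/<target>` and expands the two variables. The function name `resolve_content_config` and the `SERVER_DIR` value in the usage example are assumptions for illustration. This is not the logic that the Pentaho Docker image runs; it only shows how one entry reads as a source and destination pair.

```python
def resolve_content_config(text, variables):
    """Return (source, destination) pairs from content-config.properties text.

    Blank lines and lines starting with '#' are skipped. That handling is an
    assumption based on common .properties conventions.
    """
    pairs = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        source, sep, destination = line.partition("=")
        if not sep:
            raise ValueError(f"Missing '=' in line: {raw_line!r}")
        # Expand each ${NAME} placeholder on both sides of the mapping.
        for name, value in variables.items():
            placeholder = "${" + name + "}"
            source = source.replace(placeholder, value)
            destination = destination.replace(placeholder, value)
        pairs.append((source.strip(), destination.strip()))
    return pairs


if __name__ == "__main__":
    example = (
        "${KETTLE_HOME_DIR}/context.xml="
        "${SERVER_DIR}/tomcat/webapps/pentaho/META-INF/context.xml\n"
    )
    variables = {
        "KETTLE_HOME_DIR": "/home/pentaho/.kettle",
        # Hypothetical value; check your image for the real SERVER_DIR.
        "SERVER_DIR": "/opt/pentaho/pentaho-server",
    }
    for source, destination in resolve_content_config(example, variables):
        print(f"{source} -> {destination}")
```

Running a check like this against your edited file before uploading it can catch entries that are missing the `=` separator.
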
For instructions on how to dynamically update server configuration content from the S3 bucket, see [Dynamically update server configuration content from S3](Dynamically%20update%20server%20configuration%20content%20from%20S3.md).

