# Configure Data Optimizer

Configure Data Optimizer by using the user interface for either Cloudera Manager or Ambari to set parameters in the Data Optimizer configuration file.

## Data Optimizer configuration parameters

The Data Optimizer management interface distributes the configuration information to the Data Optimizer volume nodes for use by the Data Optimizer volume service.

**CAUTION:** Never modify the **BUCKET** and **MOUNT\_POINT** parameters in the Data Optimizer configuration file after the initial installation. Changing these values after installation breaks the instance because the Data Optimizer instance ID is calculated based on the values provided in these parameters.

**Note:** Do not include leading or trailing spaces if you copy and paste parameter values. Ambari and Cloudera Manager do not validate input.
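
Because neither management interface validates input, it can help to check pasted values before you save them. The following is a minimal sketch, not part of Data Optimizer: the function name `find_padded_values` and the sample values are hypothetical, and it only flags values that begin or end with whitespace.

```python
def find_padded_values(params):
    """Return the names of parameters whose values start or end with whitespace."""
    return sorted(name for name, value in params.items() if value != value.strip())


# Hypothetical values, as they might be pasted into the management interface.
pasted = {
    "ENDPOINT": "tenant.hcp.example.com ",  # trailing space
    "BUCKET": "instance_id",                # clean
    "ACCESS_KEY": "\tabc123",               # leading tab
}

print(find_padded_values(pasted))  # ['ACCESS_KEY', 'ENDPOINT']
```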

<table><thead><tr><th width="164.55560302734375">Parameter</th><th width="124.3333740234375">Requirement</th><th>Description</th></tr></thead><tbody><tr><td><strong>ENDPOINT</strong></td><td>Required</td><td>Endpoint address for Hitachi Content Platform. If the <strong>ENDPOINT_TYPE</strong> is <em>HCP</em>, use the form <code>tenant.hcp_dns_name</code>.</td></tr><tr><td><strong>ENDPOINT_TYPE</strong></td><td>Optional</td><td><p>Default endpoint type. Acceptable values are case sensitive.</p><ul><li>If connecting to Hitachi Content Platform, use <em>HCP</em>.</li><li>If connecting to Hitachi Content Platform for cloud scale, use <em>HCPCS</em>.</li><li>If connecting to Amazon S3, use <em>AWS</em>.</li></ul></td></tr><tr><td><strong>PDO_URL</strong></td><td>Required</td><td>The system name or IP address of the Data Catalog with which the Hadoop cluster communicates to handle migration tasks.</td></tr><tr><td><strong>DATASOURCE_ID</strong></td><td>Required</td><td>The unique ID assigned to the data source in the Data Catalog after registering the HDFS as a data source.</td></tr><tr><td><strong>PDO_SCHEDULER_INTERVAL</strong></td><td>Required</td><td>The time interval that specifies how often the Data Optimizer agent script queries the Data Catalog server for migration jobs and executes the migrations.</td></tr><tr><td><strong>BUCKET</strong></td><td>Required</td><td>Content Platform bucket name or a wildcard value of <em>instance_id</em>. You can use the unique ID generated by Content Platform (<em>instance_id</em>) as a wildcard to avoid name conflicts and to simplify configuration of the instances. Multiple instances can share a common configuration if you use the <em>instance_id</em> wildcard and all other values are identical. You cannot append or prepend the <em>instance_id</em> wildcard value to any other value. For example, <em>bucket_instance_id</em> is an invalid value.
If Content Platform is properly configured, Data Optimizer creates its own bucket if the bucket does not already exist.</td></tr><tr><td><strong>ACCESS_KEY</strong></td><td>Required</td><td>S3 Access Key ID used to authenticate S3 requests to Content Platform.</td></tr><tr><td><strong>SECRET_KEY</strong></td><td>Required</td><td>S3 Secret Key used to authenticate S3 requests.</td></tr><tr><td><strong>PROTOCOL</strong></td><td>Optional</td><td>Protocol used for communication between Data Optimizer and Content Platform. Use <em>https</em> to encrypt the communication with TLS. The default value is https. Acceptable, case-sensitive values are <em>https</em> and <em>http</em>.</td></tr><tr><td><strong>VERIFY_SSL_CERTIFICATE</strong></td><td>Optional</td><td>Value used to specify whether to verify certificates within Data Optimizer. Acceptable, case-sensitive values are <em>true</em> and <em>false</em>. The default value is true. If the <strong>VERIFY_SSL_CERTIFICATE</strong> parameter is set to <em>false</em>, certificate verification is disabled within Data Optimizer. Set this parameter to <em>false</em> when Content Platform is presenting a self-signed certificate, and you still want to use TLS to encrypt transmissions between Data Optimizer and Content Platform.</td></tr><tr><td><strong>MOUNT_POINT</strong></td><td>Required</td><td>HDFS DataNode local directory where Data Optimizer is mounted. The directory must exist and the HDFS user using Data Optimizer must have write permission for the directory. The directory must allow <code>rwx</code> permissions for the owner and owner’s group. For example:<pre><code>mkdir &#x3C;mount point>
chown user:group &#x3C;mount point>
chmod 770 &#x3C;mount point>
</code></pre></td></tr><tr><td><strong>BUCKET_STORAGE_LIMIT_GB</strong></td><td>Required</td><td><p>Size in GB to report as the total capacity of the volume.
<br></p><p><strong>CAUTION</strong>: If the usage exceeds the quota, or upper limit, on the volume’s Content Platform bucket, writes to the volume fail. Data Optimizer does not prevent writing to the volume if the usage exceeds the capacity.<br>As a best practice, specify a value that is less than the bucket quota, so that HDFS stops choosing the volume for writes before the volume exceeds its quota on Content Platform.</p></td></tr><tr><td><strong>CACHE_DIR</strong></td><td>Required</td><td><p>Directory that Data Optimizer uses to store temporary files associated with open file handles. If <strong>MD_STORE_DIR</strong> is not specified, Data Optimizer also uses this directory to store files associated with persisting the local metadata store. The directory must exist and the HDFS user using Data Optimizer must have write permission for the directory. The directory must allow <code>rwx</code> permissions for the owner and owner’s group. The <strong>CACHE_DIR</strong> parameter must be a fully-qualified directory path starting at the system root (<code>/</code>). For example:</p><pre><code>mkdir &#x3C;cache dir>
chown user:group &#x3C;cache dir>
chmod 770 &#x3C;cache dir>
</code></pre></td></tr><tr><td><strong>MD_STORE_DIR</strong></td><td>Optional</td><td>Local directory used to store files associated with persisting the Data Optimizer local metadata store. The <strong>MD_STORE_DIR</strong> parameter value must be a fully-qualified directory path starting at the system root (<code>/</code>). If an <strong>MD_STORE_DIR</strong> value is not specified, the <strong>CACHE_DIR</strong> directory is used. Specify a value for <strong>MD_STORE_DIR</strong> when the <strong>CACHE_DIR</strong> directory is located on volatile storage or if there is a more durable location for long-term file persistence. Do not choose a volatile storage medium for this directory, as it is intended to persist for the life of the Data Optimizer volume. For example, if you use transient storage for the <strong>CACHE_DIR</strong> directory, such as <code>RAM_DISK</code>, you should specify a more durable location for the <strong>MD_STORE_DIR</strong> directory. In addition, if you have a more durable location, such as a RAID partition, and there is room for the metadata store files (up to 2.5 GB), you should specify a <strong>MD_STORE_DIR</strong> directory on that partition. If the files associated with metadata store persistence are lost or corrupted, you can recover them as explained in <a href="/pages/216hfUiRE3F6JPhvxbUV#recovering-from-local-metadata-store-failure-or-corruption">Recovering from local metadata store failure or corruption</a>.</td></tr><tr><td><strong>RECOVERY_MODE</strong></td><td>Optional</td><td>Value used to specify whether recovery mode is enabled. Do not set the <strong>RECOVERY_MODE</strong> parameter unless you have read and understood the section <a href="/pages/216hfUiRE3F6JPhvxbUV#recovering-from-local-metadata-store-failure-or-corruption">Recovering from local metadata store failure or corruption</a>. The default value is false. 
Acceptable, case-sensitive values are <em>true</em> and <em>false</em>.</td></tr><tr><td><strong>LOG_LEVEL</strong></td><td>Optional</td><td>Value used to specify how verbose the logging is for Data Optimizer. The default value is INFO. Acceptable, case-sensitive values are <em>ALERT</em>, <em>ERR</em>, <em>WARNING</em>, <em>INFO</em>, and <em>DEBUG</em>. See <a data-mention href="/pages/216hfUiRE3F6JPhvxbUV#data-optimizer-logging">Data Optimizer logging</a> for more details about logging and log levels.</td></tr><tr><td><strong>METRICS_FILE</strong></td><td>Optional</td><td>Local file that Data Optimizer writes metrics to when prompted by the <code>ldoctl metrics collect</code> command. The <strong>METRICS_FILE</strong> value must be a fully-qualified file path starting at the system root (<code>/</code>). If a <strong>METRICS_FILE</strong> value is not defined, Data Optimizer writes metrics to the system journal. The parent directory must exist and the HDFS user using Data Optimizer must have write permission for the directory. See <a data-mention href="/pages/216hfUiRE3F6JPhvxbUV#monitor-data-optimizer">Monitor Data Optimizer</a> for more information.</td></tr><tr><td><strong>LOG_SDK</strong></td><td>Optional</td><td>Local directory where detailed AWS S3 logs are saved. If the <strong>LOG_SDK</strong> parameter is specified and if <strong>LOG_LEVEL</strong> is set to <em>DEBUG</em>, Data Optimizer volumes log details about the S3 communication between the Data Optimizer instance and Content Platform. The directory must exist, must be a fully-qualified directory path starting at the system root (<code>/</code>), and the HDFS user using Data Optimizer must have write permission for the directory. See <a href="/pages/216hfUiRE3F6JPhvxbUV#aws-s3-sdk-logging">AWS S3 SDK logging</a> for more information.</td></tr></tbody></table>
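
The rules in the table above (required parameters, case-sensitive values, absolute paths, and the standalone `instance_id` wildcard) can be checked mechanically before you apply a configuration. The following is a minimal sketch under stated assumptions: the `validate` function and the dictionary representation of the settings are hypothetical, not a Data Optimizer API, and the check covers only rules stated in the table.

```python
# Parameters the table marks as Required.
REQUIRED = [
    "ENDPOINT", "PDO_URL", "DATASOURCE_ID", "PDO_SCHEDULER_INTERVAL", "BUCKET",
    "ACCESS_KEY", "SECRET_KEY", "MOUNT_POINT", "BUCKET_STORAGE_LIMIT_GB", "CACHE_DIR",
]

# Case-sensitive acceptable values listed in the table.
ALLOWED = {
    "ENDPOINT_TYPE": {"HCP", "HCPCS", "AWS"},
    "PROTOCOL": {"https", "http"},
    "VERIFY_SSL_CERTIFICATE": {"true", "false"},
    "RECOVERY_MODE": {"true", "false"},
    "LOG_LEVEL": {"ALERT", "ERR", "WARNING", "INFO", "DEBUG"},
}

# Parameters that must be fully-qualified paths starting at the system root.
ABSOLUTE_PATHS = ["CACHE_DIR", "MD_STORE_DIR", "METRICS_FILE", "LOG_SDK"]


def validate(params):
    """Return a list of human-readable problems; an empty list means no rule was broken."""
    problems = []
    for name in REQUIRED:
        if not params.get(name):
            problems.append(f"{name}: required value is missing")
    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name}: {params[name]!r} is not one of {sorted(allowed)}")
    for name in ABSOLUTE_PATHS:
        if name in params and not params[name].startswith("/"):
            problems.append(f"{name}: path must start at the system root (/)")
    bucket = params.get("BUCKET", "")
    if "instance_id" in bucket and bucket != "instance_id":
        problems.append("BUCKET: the instance_id wildcard cannot be combined with other text")
    return problems


# Hypothetical example values.
example = {
    "ENDPOINT": "tenant.hcp.example.com", "PDO_URL": "pdc.example.com",
    "DATASOURCE_ID": "ds-1", "PDO_SCHEDULER_INTERVAL": "15", "BUCKET": "instance_id",
    "ACCESS_KEY": "a", "SECRET_KEY": "s", "MOUNT_POINT": "/mnt/ldo",
    "BUCKET_STORAGE_LIMIT_GB": "450", "CACHE_DIR": "/var/cache/ldo",
    "ENDPOINT_TYPE": "HCP", "LOG_LEVEL": "INFO",
}
print(validate(example))  # []
```

Returning a list of problems rather than raising on the first one lets you report every issue in a single pass.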

### General Data Optimizer Configuration for Ambari

The following table lists the parameters and their descriptions:

**CAUTION:** Never modify the **BUCKET** and **MOUNT\_POINT** parameters in the Data Optimizer configuration file after the initial installation. Changing these values after installation breaks the instance because the Data Optimizer instance ID is calculated based on the values provided in these parameters.

**Note:** Do not include leading or trailing spaces if you copy and paste parameter values. Ambari and Cloudera Manager do not validate input.

<table><thead><tr><th width="241.7777099609375">Parameter</th><th>Description</th></tr></thead><tbody><tr><td><strong>ENDPOINT_TYPE</strong></td><td><p>The type of S3 endpoint you are using. Acceptable, case sensitive values are <em>HCP</em>, <em>HCPCS</em>, and <em>AWS</em>. The default value is HCP.</p><ul><li>If connecting to Hitachi Content Platform, use <em>HCP</em>.</li><li>If connecting to Hitachi Content Platform for cloud scale, use <em>HCPCS</em>.</li><li>If connecting to Amazon S3, use <em>AWS</em>.</li></ul></td></tr><tr><td><strong>AWS_REGION</strong></td><td>The AWS region that Ambari connects to. The <strong>AWS_REGION</strong> value is required if <em>S3 Endpoint Type</em> is <em>AWS</em>.</td></tr><tr><td><strong>ENDPOINT</strong></td><td><p>The S3 endpoint URL for the object storage service.</p><ul><li>If the <strong>ENDPOINT_TYPE</strong> is <em>HCP</em>, use the form <code>tenant.hcp_dns_name</code>.</li><li>If the <strong>ENDPOINT_TYPE</strong> is <em>HCPCS</em>, use the form <code>hcpcs_dns_name</code>.</li><li>If the <strong>ENDPOINT_TYPE</strong> is <em>AWS</em>, you can leave the field blank or populate it with a region-specific S3 endpoint.</li></ul></td></tr><tr><td><strong>BUCKET</strong></td><td>S3 bucket used on the object store for all the backend storage of the Data Optimizer instances.</td></tr><tr><td><strong>ACCESS_KEY</strong></td><td>S3 Access Key ID used to authenticate S3 requests to the object store.</td></tr><tr><td><strong>SECRET_KEY</strong></td><td>S3 Secret Key used to authenticate S3 requests to the object store.</td></tr><tr><td><strong>ENDPOINT_SCHEME</strong></td><td>S3 Connection Scheme or Endpoint Scheme. Acceptable, case sensitive values are <em>https</em> and <em>http</em>. The default value is https.
If set to <em>https</em>, Data Optimizer uses TLS to encrypt all communication with object storage.</td></tr><tr><td><strong>VERIFY_SSL_CERTIFICATE</strong></td><td><p>Value used to specify whether to verify certificates within the Data Optimizer volume. Acceptable, case sensitive values are <em>Enabled</em> and <em>Disabled</em>. The default value is Enabled. <br>If the <strong>ENDPOINT_SCHEME</strong> parameter is <em>https</em>, set the <strong>VERIFY_SSL_CERTIFICATE</strong> parameter to <em>Enabled</em>. Similarly, if the <strong>ENDPOINT_SCHEME</strong> parameter is <em>http</em>, set the <strong>VERIFY_SSL_CERTIFICATE</strong> parameter to <em>Disabled</em>.</p><p>By default, Content Platform uses a self-signed certificate that is not in the trust store on the HDFS DataNode. Disabling verification allows TLS negotiation to occur, despite the untrusted certificate. Disabling verification does not reduce the strength of TLS encryption, but it does disable endpoint authentication. It is a best practice to replace the Content Platform self-signed certificate with one signed by a trusted certificate authority. See the <strong>Hitachi Content Platform</strong> documentation for details.</p></td></tr><tr><td><strong>MOUNT_POINT</strong></td><td>HDFS DataNode local directory where the Data Optimizer instance is mounted. HDFS writes block replicas to the local directory you specify. The <strong>MOUNT_POINT</strong> parameter value must be a fully-qualified directory path starting at the system root (<code>/</code>).</td></tr><tr><td><strong>VOLUME_STORAGE_LIMIT_GB</strong></td><td>The storage capacity in GB of each Data Optimizer volume instance. If the combined usage of Data Optimizer volumes exceeds the quota allocated to their shared bucket on Content Platform, writes to those Data Optimizer volumes fail. The <strong>VOLUME_STORAGE_LIMIT_GB</strong> parameter value, multiplied by the number of Data Optimizer instances, should not exceed the Content Platform quota.
In fact, the Content Platform quota should include additional capacity for deleted versions and to account for asynchronous garbage collection services. HDFS writes only the amount of data to each Data Optimizer volume that is equal to or less than the amount specified in the <strong>HCP Bucket Storage Limit</strong> parameter, minus the reserved space (<code>dfs.datanode.du.reserved</code>).</td></tr><tr><td><strong>CACHE_DIR</strong></td><td>A local directory on the HDFS DataNode that Data Optimizer uses to store temporary files associated with open file handles. The <strong>CACHE_DIR</strong> parameter must be a fully-qualified directory path starting at the system root (<code>/</code>).</td></tr><tr><td><strong>MD_STORE_DIR</strong></td><td>Local directory on each node used to store files associated with persisting the Data Optimizer local metadata store. The <strong>MD_STORE_DIR</strong> parameter value must be a fully-qualified directory path starting at the system root (<code>/</code>). Specify a value for <strong>MD_STORE_DIR</strong> when the <strong>CACHE_DIR</strong> directory is located on volatile storage or if there is a more durable location for long-term file persistence. Do not choose a volatile storage medium for this directory, as it is intended to persist for the life of the Data Optimizer volume. If the files associated with metadata store persistence are lost or corrupted, you can recover them as explained in <a href="/pages/216hfUiRE3F6JPhvxbUV#recovering-from-local-metadata-store-failure-or-corruption">Recovering from local metadata store failure or corruption</a>.</td></tr><tr><td><strong>LOG_LEVEL</strong></td><td>Value used to specify how verbose the logging is for Data Optimizer. The default value is WARNING. Acceptable, case sensitive values are <em>ALERT</em>, <em>ERR</em>, <em>WARNING</em>, <em>INFO</em>, and <em>DEBUG</em>.
See <a href="/pages/216hfUiRE3F6JPhvxbUV#data-optimizer-logging">Data Optimizer logging</a> for details about logging and log levels.</td></tr><tr><td><strong>LOG_SDK</strong></td><td>Optional. Local directory where detailed AWS S3 logs are saved. If the <strong>LOG_SDK</strong> parameter is specified and if <strong>LOG_LEVEL</strong> is set to <em>DEBUG</em>, Data Optimizer volumes log details about the S3 communication between the Data Optimizer volume instance and Content Platform. The <code>LOG_SDK</code> parameter value must exist, must be a fully-qualified directory path starting at the system root (<code>/</code>), and the HDFS user using Data Optimizer must have write permission for the directory. See <a href="/pages/216hfUiRE3F6JPhvxbUV#aws-s3-sdk-logging">AWS S3 SDK logging</a> for further details.</td></tr></tbody></table>
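
The sizing guidance for **VOLUME\_STORAGE\_LIMIT\_GB** in the table above is simple arithmetic, sketched here for illustration. The function names are hypothetical, the 10 percent headroom for deleted versions and garbage collection is an assumed figure (the documentation does not give one), and the conversion of `dfs.datanode.du.reserved` assumes that value is expressed in bytes and that 1 GB is 1024³ bytes.

```python
def fits_bucket_quota(volume_limit_gb, instance_count, bucket_quota_gb):
    """True if the per-volume limit times the number of instances stays within the quota."""
    return volume_limit_gb * instance_count <= bucket_quota_gb


def largest_volume_limit_gb(bucket_quota_gb, instance_count, headroom_percent=10):
    """Largest whole-GB per-volume limit that leaves the assumed headroom unallocated."""
    usable_gb = bucket_quota_gb * (100 - headroom_percent) // 100
    return usable_gb // instance_count


def hdfs_writable_gb(volume_limit_gb, du_reserved_bytes):
    """Capacity HDFS can write to one volume: the limit minus the reserved space."""
    return volume_limit_gb - du_reserved_bytes / 1024**3


# Hypothetical cluster: 20 instances sharing a 10000 GB bucket quota.
print(largest_volume_limit_gb(10000, 20))    # 450
print(fits_bucket_quota(450, 20, 10000))     # True
print(hdfs_writable_gb(450, 10 * 1024**3))   # 440.0
```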

**Note**: The configuration file is located in the `/etc/ldo` directory on each HDFS DataNode on which Data Optimizer is installed and the **ARCHIVE** volumes are configured.

### Settings for HTTP or HTTPS Proxy Connections

In some cases, Data Optimizer is installed on a host that does not have direct access to the object storage service and must connect through a proxy. This is more likely when you use a cloud storage provider such as Amazon Web Services. Using the settings in this section, you can configure Data Optimizer to use an HTTP or HTTPS proxy. If a proxy is not required, leave these settings at their defaults.

<table><thead><tr><th width="193.99993896484375">Parameter</th><th>Description</th></tr></thead><tbody><tr><td><strong>PROXY</strong></td><td>The IP address or domain name of the http or https proxy server, if required.</td></tr><tr><td><strong>PROXY_PORT</strong></td><td>The port that the proxy server listens on.</td></tr><tr><td><strong>PROXY_SCHEME</strong></td><td>The scheme is either <em>http</em> or <em>https</em>, depending on what the proxy server supports.</td></tr><tr><td><strong>PROXY_USER</strong></td><td>The user for the proxy server, if authentication is required.</td></tr><tr><td><strong>PROXY_PASSWORD</strong></td><td>The password for the proxy server, if authentication is required.</td></tr></tbody></table>
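
For orientation, the five proxy settings correspond to the parts of a conventional proxy URL. The following sketch only illustrates that mapping; the `proxy_url` function and the sample host, port, and credentials are hypothetical, and nothing here describes how Data Optimizer combines the values internally.

```python
from urllib.parse import quote


def proxy_url(host, port, scheme, user=None, password=None):
    """Compose scheme://[user[:password]@]host:port from the individual proxy settings."""
    if scheme not in ("http", "https"):
        raise ValueError("PROXY_SCHEME must be http or https")
    credentials = ""
    if user:
        # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
        credentials = quote(user, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"{scheme}://{credentials}{host}:{port}"


# Hypothetical values for PROXY, PROXY_PORT, PROXY_SCHEME, PROXY_USER, and PROXY_PASSWORD.
print(proxy_url("proxy.example.com", 3128, "http"))
# http://proxy.example.com:3128
print(proxy_url("proxy.example.com", 8443, "https", "svc", "p@ss:word"))
# https://svc:p%40ss%3Aword@proxy.example.com:8443
```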

### Recovery Specific Configuration for Ambari

Use the following parameter to configure the recovery mode for Ambari.

**CAUTION:** Do not enable this parameter unless you have familiarized yourself with the [Maintain Data Optimizer metadata](/pdc-10.2-install/install-pentaho-data-catalog/install-pentaho-data-optimizer-in-hadoop-cluster/maintain-pentaho-data-optimizer.md) section and understand the implications.

<table><thead><tr><th width="197.33331298828125">Parameter</th><th>Description</th></tr></thead><tbody><tr><td><strong>RECOVERY_MODE</strong></td><td>Value used to determine whether recovery mode is enabled. The <strong>RECOVERY_MODE</strong> parameter controls the Data Optimizer authoritative versus non-authoritative behavior. Acceptable values are <em>Enabled</em> and <em>Disabled</em>. The default value is Disabled.</td></tr></tbody></table>

### Volume Monitor Configuration for Cloudera Manager only

Use the following parameter to configure the Volume Monitor interval for Cloudera Manager.

<table><thead><tr><th width="201.7777099609375">Parameter</th><th>Description</th></tr></thead><tbody><tr><td><strong>MONITOR_INTERVAL</strong></td><td>Value used to specify how frequently, in minutes, the Volume Monitor checks the health of the Data Optimizer volume. As a best practice, set the interval to five minutes.</td></tr></tbody></table>

## Hitachi Content Platform configuration

Data Optimizer requires either Hitachi Content Platform or Hitachi Content Platform for cloud scale.

For both Content Platform and Content Platform for cloud scale, a single user creates and owns all the Data Optimizer buckets. To protect the data in these buckets, do not share this user's credentials with any other application, and give access to the credentials only to an HDFS or Data Optimizer administrator. The credentials are stored in the Data Optimizer configuration files on the HDFS DataNodes.

See the [**Hitachi Content Platform** product documentation](https://docs.hitachivantara.com/r/en-us/content-platform/9.6.x/mk-99arc026/hcp-overview/introduction-to-hitachi-content-platform) for more information.

**Note:** If you need to work with customer support to troubleshoot or resolve an issue, make sure that you share the Content Platform user credentials with them.

### Configure a tenant in Content Platform

To create a Content Platform tenant, you need the administrator role.

You must create a Hitachi Content Platform tenant for Data Optimizer. In most cases, Data Optimizer instances create their own buckets, so you need to properly configure namespace defaults to result in properly configured buckets.

Use the following steps to configure a tenant in Content Platform.

1. In the top-level menu of the Hitachi Content Platform System Management Console, click **Tenants**.

   The Tenants page opens.
2. On the Tenants page, click **Create Tenant**.

   The Create Tenant panel opens.
3. On the Create Tenant panel, create a tenant, making sure to:
   * Allocate enough quota for all anticipated Data Optimizer instances.
   * Enable versioning. See the **Hitachi Content Platform** product documentation for more information.
4. Use the following steps to enable the management API (MAPI), so that Data Optimizer instances can create buckets.
   1. Log into the System Management Console or Tenant Management Console using a user account with the security role.
   2. In the top-level menu of either console, select **Security** > **MAPI**.

      The Management API page opens.
   3. In the **Management API Setting** section on the Management API page, select **Enable the HCP management API**.
   4. Click **Update Settings**.
5. Enable MAPI at the cluster level.
6. Use the following steps to configure namespace defaults for the tenant:
   1. From the Content Platform Tenant Management Console, select **Configuration** > **Namespace Defaults**.
   2. In the **Hard Quota** field, type a new number of gigabytes or terabytes of storage to allocate for an individual Data Optimizer instance namespace and select either GB or TB to indicate the measurement unit. The default is 50 GB. The maximum value you can specify is equal to the hard quota for the tenant.
   3. Set **Cloud Optimized** to **On**.
   4. Set **Versioning** to **On**.
   5. Enable version pruning older than 0 days.

### Create a tenant user account

Use this task in Hitachi Content Platform to create a tenant user account to be used exclusively by Data Optimizer, not by an actual user. This user owns and has exclusive data access permissions to Data Optimizer buckets.

**Note:** The tenant user must not have any administrative role in the tenant beyond administration of the buckets they own. No users should have access to the data in Data Optimizer buckets at any time for any reason except when required by customer support.

Use the following steps in the Content Platform Tenant Management Console to create a tenant user account. See the **Hitachi Content Platform** product documentation for more information.

1. Navigate to **Security** > **Users** > **Create User Account**.

   The Create User Account panel opens.
2. In the Create User Account panel, in the **Username** field, type a login account.

   Adhere to the following guidelines:

   1. Choose a name like `pdso-svc-usr`, to indicate that the user is not a person but a software service.
   2. Do not enable any administrative roles.
   3. Select **Allow namespace management**.

      You need to do this so Data Optimizer instances can create buckets.
3. Click **Create User Account**.

   The text “`Successfully created user account. Authorization token:`” is shown, followed by a text string with two values separated by a colon. The value on the left side of the text string is the base64-encoded username for the **ACCESS\_KEY** property, and the value on the right is the md5-encoded password to use for the **SECRET\_KEY** property.
4. Capture the base64-encoded username and md5-encoded password to add to the Data Optimizer configuration file.
5. Edit the Data Optimizer configuration file in the `/etc/ldo` directory and add the encoded username to the **ACCESS\_KEY** property and the encoded password to the **SECRET\_KEY** property.
6. Save and close the configuration file.
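
The token handling in the steps above can be sketched as follows. This is an illustration only: both function names are hypothetical, the sample password is made up, and the sketch assumes that "md5-encoded" means the hexadecimal MD5 digest of the password. Compare the result with the token shown by the console rather than relying on this assumption.

```python
import base64
import hashlib


def split_authorization_token(token):
    """Split 'left:right' into (ACCESS_KEY, SECRET_KEY) at the first colon."""
    access_key, separator, secret_key = token.strip().partition(":")
    if not separator or not access_key or not secret_key:
        raise ValueError("expected two values separated by a colon")
    return access_key, secret_key


def encode_credentials(username, password):
    """Derive the two values from plain credentials (assumes a hex MD5 digest)."""
    access_key = base64.b64encode(username.encode("utf-8")).decode("ascii")
    secret_key = hashlib.md5(password.encode("utf-8")).hexdigest()
    return access_key, secret_key


# Hypothetical credentials for the service account created above.
access_key, secret_key = encode_credentials("pdso-svc-usr", "example-password")
token = f"{access_key}:{secret_key}"
assert split_authorization_token(token) == (access_key, secret_key)
```

Splitting at the first colon is safe under these assumptions because neither the base64 alphabet nor a hexadecimal digest contains a colon.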

### (Optional) Create a bucket for Data Optimizer

Use this task to manually create a bucket for the Data Optimizer instance.

**Note:** The best practice is to let Data Optimizer instances create their own buckets.

Perform the following steps in Hitachi Content Platform to create a bucket manually. See the **Hitachi Content Platform** documentation for more information.

1. In the Content Platform Tenant Management Console, click **Namespaces**.

   The Namespaces page opens.
2. On the Namespaces page, click **Create Namespace**.

   The Create Namespace panel opens.
3. Use the following steps to create a namespace:
   1. In the **Namespace Owner** field, specify the tenant user created in the [Create a tenant user account](#create-a-tenant-user-account) procedure.
   2. Configure **Hard Quota** to provide adequate capacity for an individual Data Optimizer instance.
   3. Set **Cloud Optimized** to **On**.
   4. Set **Versioning** to **On**.
   5. Enable version pruning older than 0 days.
4. Use the following steps to enable an access control list (ACL):
   1. In the Tenant Management Console, click **Namespaces**.

      The Namespaces page opens.
   2. In the list of namespaces, click the name of the Data Optimizer namespace.
   3. Click the **Settings** tab.

      The Settings panel opens.
   4. On the left side of the Settings panel, click **ACLs**.

      The ACLs panel opens.
   5. In the ACLs panel, select **Enable ACLs**.

      A confirmation prompt displays.
   6. Click **Enable ACLs**.
5. Use the following steps to enable the Hitachi API for Amazon S3:
   1. In the Tenant Management Console, click **Namespaces**.

      The Namespaces page opens.
   2. In the list of namespaces, click the name of the Data Optimizer namespace.
   3. Click the **Protocols** tab.

      The Protocols panel opens.
   4. Select **Enable Hitachi API for Amazon S3**.

      **Note:** Enable HTTP only if you will not be using TLS.
   5. Click **Update Settings**.
6. Specify the namespace name in the **BUCKET** parameter of the Data Optimizer configuration file in the `/etc/ldo` directory.

## HCP for cloud-scale configuration

If you are using HCP for cloud-scale and configuring more than 100 Data Optimizer instances, you need to increase the maximum number of buckets allowed for your user.

See [**Hitachi Content Platform configuration documentation**](https://docs.hitachivantara.com/r/en-us/content-platform/9.6.x/mk-99arc026/hcp-overview) for more information.

