General Data Optimizer Configuration for Ambari

CAUTION:

Never modify the BUCKET and MOUNT_POINT parameters in the Data Optimizer configuration file after the initial installation. Changing these values after installation breaks the instance because the Data Optimizer instance ID is calculated based on the values provided in these parameters.

Note: Do not include leading or trailing spaces if you copy and paste parameter values. Ambari and Cloudera Manager do not validate input.

Parameter

Description

ENDPOINT_TYPE

The type of S3 endpoint you are using. Acceptable, case-sensitive values are HCP, HCPCS, and AWS. The default value is HCP.

  • If connecting to Hitachi Content Platform, use HCP.

  • If connecting to Virtual Storage Platform One Object, use HCPCS.

  • If connecting to Amazon S3, use AWS.

AWS_REGION

The AWS region that Ambari connects to. A value for AWS_REGION is required if ENDPOINT_TYPE is AWS.

ENDPOINT

The S3 endpoint URL for the object storage service.

  • If the ENDPOINT_TYPE is HCP, use the form tenant.hcp_dns_name.

  • If the ENDPOINT_TYPE is HCPCS, use the form hcpcs_dns_name.

  • If the ENDPOINT_TYPE is AWS, you can leave the field blank or populate it with a region-specific S3 endpoint.

BUCKET

The S3 bucket on the object store that provides the backend storage for all Data Optimizer instances.

ACCESS_KEY

S3 Access Key ID used to authenticate S3 requests to the object store.

SECRET_KEY

S3 Secret Key used to authenticate S3 requests to the object store.

ENDPOINT_SCHEME

S3 Connection Scheme or Endpoint Scheme. Acceptable, case-sensitive values are https and http. The default value is https. If set to https, Data Optimizer uses TLS to encrypt all communication with object storage.

VERIFY_SSL_CERTIFICATE

Value used to specify whether the Data Optimizer volume verifies the object store's TLS certificate. Acceptable, case-sensitive values are Enabled and Disabled. The default value is Enabled.

| If the ENDPOINT_SCHEME parameter is: | Then set the VERIFY_SSL_CERTIFICATE parameter to: |
|--------------------------------------|---------------------------------------------------|
| https | Enabled |
| https and the object store certificate is self-signed | Disabled |

By default, Content Platform uses a self-signed certificate that is not in the trust store on the HDFS DataNode. Disabling verification allows TLS negotiation to occur, despite the untrusted certificate. Disabling verification does not reduce the strength of TLS encryption, but it does disable endpoint authentication. It is a best practice to replace the Content Platform self-signed certificate with one signed by a trusted certificate authority. See the Hitachi Content Platform documentation for details.

MOUNT_POINT

HDFS DataNode local directory where the Data Optimizer instance is mounted. HDFS writes block replicas to the local directory you specify. The MOUNT_POINT parameter value must be a fully-qualified directory path starting at the system root (/).

VOLUME_STORAGE_LIMIT_GB

The storage capacity in GB of each Data Optimizer volume instance. If the combined usage of the Data Optimizer volumes exceeds the quota allocated to their shared bucket on Content Platform, writes to those volumes fail. The VOLUME_STORAGE_LIMIT_GB parameter value, multiplied by the number of Data Optimizer instances, must not exceed the Content Platform quota. The quota should also include additional capacity for deleted versions and for asynchronous garbage collection services. HDFS writes to each Data Optimizer volume only an amount of data equal to or less than the value specified in the HCP Bucket Storage Limit parameter, minus the reserved space (dfs.datanode.du.reserved).
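As a rough illustration of this sizing rule, the following sketch computes a minimum bucket quota from the volume limit and instance count. The function name and the 20% headroom figure are illustrative assumptions, not values from the product documentation; size the actual headroom for deleted versions and garbage collection to your environment.

```python
def min_hcp_quota_gb(volume_storage_limit_gb, num_instances, overhead_factor=1.2):
    """Estimate the minimum Content Platform bucket quota in GB.

    overhead_factor adds headroom for deleted versions and
    asynchronous garbage collection (the 1.2 default is an
    illustrative assumption, not a documented recommendation).
    """
    return volume_storage_limit_gb * num_instances * overhead_factor

# Example: 10 DataNodes, each hosting a 500 GB Data Optimizer volume
print(min_hcp_quota_gb(500, 10))  # 6000.0 (GB, including 20% headroom)
```

If the computed figure exceeds the quota allocated to the shared bucket, either lower VOLUME_STORAGE_LIMIT_GB or increase the Content Platform quota before writes start failing.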

CACHE_DIR

A local directory on the HDFS DataNode that Data Optimizer uses to store temporary files associated with open file handles. The CACHE_DIR parameter value must be a fully-qualified directory path starting at the system root (/).

MD_STORE_DIR

Local directory on each node used to store the files that persist the Data Optimizer local metadata store. The MD_STORE_DIR parameter value must be a fully-qualified directory path starting at the system root (/). Specify a value for MD_STORE_DIR when the CACHE_DIR directory is located on volatile storage or when a more durable location is available for long-term file persistence. Do not choose a volatile storage medium for this directory, as it is intended to persist for the life of the Data Optimizer volume. If the files associated with metadata store persistence are lost or corrupted, you can recover them as explained in Recovering from local metadata store failure or corruption.

LOG_LEVEL

Value used to specify how verbose the logging is for Data Optimizer. Acceptable, case-sensitive values are ALERT, ERR, WARNING, INFO, and DEBUG. The default value is WARNING. See Data Optimizer logging for details about logging and log levels.

LOG_SDK

Optional. Local directory where detailed AWS S3 logs are saved. If the LOG_SDK parameter is specified and LOG_LEVEL is set to DEBUG, Data Optimizer volumes log details about the S3 communication between the Data Optimizer volume instance and Content Platform. The directory specified by LOG_SDK must exist, must be a fully-qualified directory path starting at the system root (/), and the HDFS user running Data Optimizer must have write permission for it. See AWS S3 SDK logging for further details.

**Note:** The configuration file is located in the `/etc/ldo` directory on each HDFS DataNode where Data Optimizer is installed and the **ARCHIVE** volumes are configured.
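To show how the parameters above fit together, here is a hypothetical configuration for an HCP endpoint. The file name, the KEY=value syntax, and every value shown are illustrative assumptions, not product defaults; substitute values for your own environment.

```shell
# Illustrative Data Optimizer configuration (all values are examples)
ENDPOINT_TYPE=HCP                     # HCP, HCPCS, or AWS (case sensitive)
ENDPOINT=mytenant.hcp.example.com     # tenant.hcp_dns_name form for HCP
BUCKET=ldo-backend                    # never change after installation
ACCESS_KEY=<your-s3-access-key-id>
SECRET_KEY=<your-s3-secret-key>
ENDPOINT_SCHEME=https
VERIFY_SSL_CERTIFICATE=Enabled        # Disabled if the cert is self-signed
MOUNT_POINT=/hadoop/ldo               # never change after installation
VOLUME_STORAGE_LIMIT_GB=500
CACHE_DIR=/var/cache/ldo
MD_STORE_DIR=/var/lib/ldo/mdstore     # durable, non-volatile storage
LOG_LEVEL=WARNING
```

Remember that BUCKET and MOUNT_POINT feed into the instance ID calculation, so changing either value after installation breaks the instance.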
