
# Step 2: Install the Pentaho Data Optimizer Management Pack for Apache Ambari

To add the Data Optimizer service to a cluster, you must first install the management pack (mpack) on the Apache Ambari server. Apache Ambari provides a framework for deploying and managing third-party services like Data Optimizer in an Apache Hadoop cluster. The mpack defines the Data Optimizer service and its roles for Apache Ambari.

The Data Optimizer mpack contains metadata files that communicate to Apache Ambari what the Data Optimizer service is, the roles the service provides, and how the service is managed. For example, the mpack tells Apache Ambari which scripts to call to start or stop the roles associated with the service.

The mpack also contains the Data Optimizer code in the form of executable binaries and scripts. Apache Ambari executes Data Optimizer code according to the instructions provided in the mpack whenever you:

* start or stop the service or roles
* change log levels
* run instance recovery
* enable or disable recovery mode

Each host in a cluster has roles. Roles determine which services are installed and on which hosts. The Data Optimizer mpack for Apache Ambari contains a single role called **Volume**. Instances of this role are added to HDFS data nodes and enable the Data Optimizer tiering capability on those data nodes. For more information on the Volume role, see [Access the Data Optimizer volumes directly](/pdc-10.2-data-optimizer/pdso-install-landing-page/pdso-install-in-hadoop-cluster/pdso-maintain-landing-page/troubleshoot-data-storage-optimizer-fs/accessing-pdso-volumes-directly.md).

To prepare Apache Ambari for the Data Optimizer installation, first download the Pentaho Data Optimizer mpack and then install it on the Apache Ambari server.
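As a rough illustration of the install step, the sketch below builds the commands typically run on the Apache Ambari server to install an mpack and then restart the server so that the new service definition is loaded. It only prints the commands and does not run them. The archive path and the function name are hypothetical placeholders, and the exact options for Data Optimizer are an assumption based on the general `ambari-server install-mpack` usage; follow the product's install steps for the authoritative command.

```python
import shlex
from typing import List


def build_mpack_install_commands(mpack_archive: str) -> List[List[str]]:
    """Return the argv lists for installing an mpack and restarting Ambari.

    mpack_archive is the path to the downloaded mpack archive on the
    Ambari server host (a placeholder here, not a real file name).
    """
    return [
        # Register the management pack with the Ambari server.
        ["ambari-server", "install-mpack", f"--mpack={mpack_archive}", "--verbose"],
        # Restart the Ambari server so it picks up the new service definition.
        ["ambari-server", "restart"],
    ]


if __name__ == "__main__":
    # Hypothetical download location; substitute the real archive path.
    for argv in build_mpack_install_commands("/tmp/example-mpack.tar.gz"):
        print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on the Ambari server itself, the same argument lists could be passed to `subprocess.run` with appropriate privileges.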

