Installing Pentaho Data Optimizer on a Cloudera Manager cluster

To install Data Optimizer on a Cloudera Manager cluster, use the following workflow:

Step 1: Download the Pentaho Data Optimizer software

Step 2: Add the Pentaho Data Optimizer parcel to a parcel repository

Step 3: Download the parcel to Cloudera Manager

Step 4: Install the custom service descriptor on the Cloudera Manager server

Step 5: Distribute and activate the parcel

Step 6: Add the Pentaho Data Optimizer service to the cluster

Step 7: Configure Data Optimizer volumes to restart automatically

Step 8: Configure HDFS to use the Data Optimizer volume

Step 9: Refresh HDFS Datanodes after adding Data Optimizer volumes

Step 10: Data Optimizer extension for Cloudera Manager
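The file-placement portion of the workflow (Steps 2 through 4) can be sketched at the shell level. This is a minimal dry-run sketch, not the official procedure: the parcel and CSD jar file names shown are hypothetical, and on a real Cloudera Manager host the target directories are typically `/opt/cloudera/parcel-repo` and `/opt/cloudera/csd`. The sketch stages everything into a temporary directory so it is safe to run anywhere.

```shell
#!/bin/sh
# Sketch of Steps 2-4 under assumptions: file names and repo layout are
# illustrative; consult the linked step pages for the exact artifacts.
set -eu

# Use a local staging area so the sketch is safe to dry-run; on a real
# Cloudera Manager host these would be /opt/cloudera/parcel-repo and
# /opt/cloudera/csd.
STAGE="$(mktemp -d)"
PARCEL_REPO="$STAGE/parcel-repo"
CSD_DIR="$STAGE/csd"
mkdir -p "$PARCEL_REPO" "$CSD_DIR"

PARCEL="PDO-10.2.0-el7.parcel"               # hypothetical parcel file name
printf 'parcel-bytes\n' > "$STAGE/$PARCEL"   # stands in for the Step 1 download

# Steps 2-3: Cloudera Manager lists a parcel from its local repository only
# when a matching .sha file (the parcel's SHA-1 hash) sits next to it.
cp "$STAGE/$PARCEL" "$PARCEL_REPO/"
sha1sum "$PARCEL_REPO/$PARCEL" | awk '{print $1}' > "$PARCEL_REPO/$PARCEL.sha"

# Step 4: place the custom service descriptor (CSD) jar in the CSD directory,
# then restart cloudera-scm-server so the new service type becomes available.
printf 'jar-bytes\n' > "$CSD_DIR/PDO-10.2.0.jar"   # hypothetical CSD jar name
# systemctl restart cloudera-scm-server            # on the real CM host, as root

echo "staged parcel: $PARCEL_REPO/$PARCEL"
```

The later steps (distributing and activating the parcel, adding the service, and the HDFS configuration changes) are performed in the Cloudera Manager UI rather than at the shell, so the sketch stops at Step 4.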


Last updated 24 days ago


© 2025 Hitachi Vantara LLC