Running and stopping Data Optimizer in Ambari

Use the following best practices when starting and stopping Data Optimizer, its volumes, and other services.

Start and stop Data Optimizer

When you start or stop services for a single host or for all hosts in the cluster, always start the Data Optimizer volume component before the HDFS DataNode component, and stop it only after the HDFS DataNode component has stopped. Stopping the Data Optimizer volume while HDFS is running might cause data availability issues or transient HDFS volume failures. Starting HDFS while Data Optimizer is not running can also negatively impact your operations.

If you take a DataNode offline without first putting it into maintenance mode, HDFS creates additional block replicas to maintain the required replication factor. As a best practice, set the DataNode to the maintenance state described in HDFS-7877 before performing routine DataNode maintenance. This avoids unnecessarily re-protecting a large number of blocks on a Data Optimizer volume.
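The following is a minimal sketch of how a maintenance-state entry might be generated. It is not part of the Data Optimizer product. It assumes the NameNode is configured to read a JSON combined hosts file (the mechanism that HDFS-7877 builds on) and that you run hdfs dfsadmin -refreshNodes after updating that file. The host name and the two-hour window are placeholders, and the field names should be verified against the documentation for your Hadoop version.

```python
import json
import time


def maintenance_entry(host_name, hours, now_ms=None):
    """Build one hosts-file entry that marks a DataNode as in maintenance.

    Field names follow the HDFS combined hosts file format as understood
    here (an assumption to verify for your Hadoop release).
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return {
        "hostName": host_name,
        "adminState": "IN_MAINTENANCE",
        # Absolute expiry time of the maintenance window, in epoch milliseconds.
        "maintenanceExpireTimeInMS": now_ms + int(hours * 3600 * 1000),
    }


if __name__ == "__main__":
    # dn01.example.com is a hypothetical DataNode host.
    entries = [maintenance_entry("dn01.example.com", hours=2)]
    print(json.dumps(entries, indent=2))
```

The expiry time bounds the maintenance window, so choose a value that comfortably covers the planned work.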

Start and stop Data Optimizer volumes

As a best practice, always start Data Optimizer volumes before the HDFS DataNodes, and stop them only after the HDFS DataNodes have stopped. Following this sequence prevents the data availability issues and transient volume failures described above. Like most services and service roles, the Data Optimizer service and the Volume role provide start and stop commands that you can run in several ways through Ambari:

  • Service-wide actions

    Use the Start and Stop actions on the Data Optimizer service to start and stop all volume instances associated with the service simultaneously.

  • Individual volume actions

    Start and stop individual volume instances via the Hosts tab. Drill down into an individual host and select Start, Stop, or Restart from the action menu for the specific volume instance.
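The same two scopes can also be driven through the Ambari REST API rather than the web UI. The sketch below only builds the requests and sends nothing. The service name DATA_OPTIMIZER, the component name DATA_OPTIMIZER_VOLUME, the cluster name, the host, and the server URL are all hypothetical placeholders, so look up the real names in your cluster before use. In Ambari, the state STARTED starts a service or component and INSTALLED stops it.

```python
import json

# Ambari expresses start/stop as a desired state on the resource.
STATE_FOR_ACTION = {"start": "STARTED", "stop": "INSTALLED"}


def _request(url, context, body_key, action):
    """Assemble a PUT request description for the Ambari REST API."""
    payload = {
        "RequestInfo": {"context": context},
        "Body": {body_key: {"state": STATE_FOR_ACTION[action]}},
    }
    return {
        "method": "PUT",
        "url": url,
        # Ambari rejects modifying requests that lack this header.
        "headers": {"X-Requested-By": "ambari"},
        "body": json.dumps(payload),
    }


def service_request(base_url, cluster, service, action):
    """Service-wide action: affects every volume instance of the service."""
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    return _request(url, f"{action} {service}", "ServiceInfo", action)


def host_component_request(base_url, cluster, host, component, action):
    """Individual action: affects one component instance on one host."""
    url = (f"{base_url}/api/v1/clusters/{cluster}/hosts/{host}"
           f"/host_components/{component}")
    return _request(url, f"{action} {component} on {host}", "HostRoles", action)


if __name__ == "__main__":
    base = "http://ambari.example.com:8080"  # hypothetical Ambari server
    print(service_request(base, "c1", "DATA_OPTIMIZER", "stop"))
    print(host_component_request(
        base, "c1", "dn01.example.com", "DATA_OPTIMIZER_VOLUME", "start"))
```

When issuing individual requests this way, Ambari does not necessarily enforce the recommended order for you, so stop the DataNode on a host before stopping its volume, and start the volume before starting the DataNode.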

Start and stop all services

The Data Optimizer Management Pack for Ambari includes a dependency definition indicating that the HDFS DataNode component depends on the Data Optimizer volume component. Due to this dependency, when using the cluster or host-level start, stop, and restart commands, Ambari ensures that HDFS is stopped before Data Optimizer and started after Data Optimizer. This order adheres to the recommended best practices.

As a result, it is safe to use the cluster-level start and stop actions, as Ambari automatically manages the correct sequence for starting and stopping services, maintaining the integrity and availability of your data.
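To illustrate how a single dependency yields both orderings, here is a small sketch that derives the start order with a topological sort and the stop order by reversing it. It is an illustration of the principle only, not Ambari's actual implementation, and the component name DATA_OPTIMIZER_VOLUME is a hypothetical placeholder. It requires Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Maps each component to the set of components it depends on.
# The DataNode depends on the Data Optimizer volume.
DEPENDS_ON = {"DATANODE": {"DATA_OPTIMIZER_VOLUME"}}


def start_order(depends_on):
    """Dependencies come first, so the volume starts before the DataNode."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Stopping reverses the start order: dependents stop first."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("start:", start_order(DEPENDS_ON))
    print("stop: ", stop_order(DEPENDS_ON))
```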
