Running and stopping Data Optimizer in Cloudera
Use the following best practices when starting or stopping Data Optimizer, its volumes, and other services in Cloudera.
Start and stop Data Optimizer
When you start or stop services for a single host or for all hosts in the cluster, always start the Data Optimizer volume component before the HDFS DataNode component, and stop it after the HDFS DataNode component. Stopping the Data Optimizer volume while HDFS is running might cause data availability issues or transient HDFS volume failures. Starting HDFS when Data Optimizer is not running can negatively impact your operations.
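The ordering rule above can be sketched as a small planning helper. This is only an illustration: the component labels and the `plan` function are hypothetical names, not Cloudera Manager identifiers, and the restart sequence is simply the stop order followed by the start order.

```python
# Hypothetical labels used for illustration; not Cloudera Manager identifiers.
VOLUME = "Data Optimizer volume"
DATANODE = "HDFS DataNode"


def plan(action: str) -> list[tuple[str, str]]:
    """Return the recommended (verb, component) steps for an action.

    Start: volume first, then DataNode.
    Stop: DataNode first, then volume.
    Restart: the stop order followed by the start order.
    """
    start = [("start", VOLUME), ("start", DATANODE)]
    stop = [("stop", DATANODE), ("stop", VOLUME)]
    if action == "start":
        return start
    if action == "stop":
        return stop
    if action == "restart":
        return stop + start
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    for verb, component in plan("restart"):
        print(f"{verb} {component}")
```

In every plan the Data Optimizer volume is running for as long as the DataNode is running, which is the condition the best practice is meant to guarantee.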
If you take a DataNode offline without first putting it into maintenance mode, HDFS creates additional block replicas to maintain the required number of replication copies. As a best practice, set the DataNode to a maintenance state, as described in HDFS-7877, to avoid unnecessarily re-protecting a large number of blocks on a Data Optimizer volume during routine DataNode maintenance.
Start and stop Data Optimizer volumes
As a best practice, always start Data Optimizer volumes before the HDFS DataNodes and stop them after stopping the HDFS DataNodes. This sequence ensures optimal operation and prevents potential data availability issues.
Like most services and service roles, the Data Optimizer service and the Volume role include Start and Stop commands that you can access and execute in multiple ways through Cloudera Manager.
Service-wide actions
Use the Start and Stop actions on the Data Optimizer service to start and stop all volume instances associated with the service simultaneously.
Individual volume actions
Start and stop individual volume instances through the Hosts tab. Drill down into an individual host and select Start, Stop, or Restart from the action menu for the specific volume instance.
Start and stop all services
Cloudera Manager does not provide a way to define dependencies between services or to influence the order in which services stop or start, whether you stop or start all services or perform rolling restarts. Cloudera Manager is unaware that HDFS depends on Data Optimizer and assumes that Data Optimizer, like most services, depends on HDFS. As a result, the cluster-level and host-level Start, Stop, and Restart commands stop Data Optimizer before HDFS and start HDFS before Data Optimizer, which is not the recommended order. In particular, the cluster Stop command stops Data Optimizer volumes before it stops HDFS DataNodes.
The cluster Start command starts HDFS DataNodes before it starts Data Optimizer volumes. For this reason, always start the Data Optimizer service before using the cluster Start command.
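If you script cluster startup, the same workaround can be expressed as an ordered pair of requests: start the Data Optimizer service first, then issue the cluster Start. The sketch below only builds the request list and sends nothing. It assumes the Cloudera Manager REST API command endpoints for services and clusters; the host name, API version, cluster name, and service name shown are placeholders that you would replace with the values from your own deployment.

```python
from urllib.parse import quote


def start_sequence(base_url: str, api_version: str, cluster: str,
                   data_optimizer_service: str) -> list[tuple[str, str]]:
    """Return (method, url) pairs in the order they should be issued.

    All arguments are deployment-specific; the values used in the example
    below are placeholders, not defaults.
    """
    root = (f"{base_url.rstrip('/')}/api/{api_version}"
            f"/clusters/{quote(cluster, safe='')}")
    service = quote(data_optimizer_service, safe="")
    return [
        # 1. Start the Data Optimizer service so its volumes are up first.
        ("POST", f"{root}/services/{service}/commands/start"),
        # 2. Then start the remaining services with the cluster Start command.
        ("POST", f"{root}/commands/start"),
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for method, url in start_sequence(
            "https://cm.example.com:7183", "v41", "Cluster 1", "data_optimizer"):
        print(method, url)
```

A real script would also need authentication and should wait for the first command to finish before issuing the second, because the ordering only helps if the Data Optimizer volumes are actually running when the DataNodes start.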