# Perform automated data tiering

Policy Manager is installed on all nodes where Data Optimizer is running. You can run the commands on HDFS data nodes where Data Optimizer volumes are mounted to automatically tier data and move files. Perform the following steps from the command line interface (CLI) at a location accessible to the server.

1. Run `cd /opt/ldo` to go to the Policy Manager working folder.
2. Create a policy configuration file with a unique name, and then define the policies in JSON format.

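   Use the following example as a syntax reference. The `#` annotations describe each field; standard JSON has no comment syntax, so you may need to leave them out of the actual file.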

   ```
   vi <policy file>
   {
       "policy1" : {                        # First policy
           "name": "policy1",               # Name of the policy
           "path": "/data/path/to/scan",    # HDFS path to scan for files
           "tag": "COLD",                   # Tag to apply to matching files
           "retention": 180                 # Number of days since the file was last accessed
       }
   }
   ```

   The resulting policy file (for example, `policiesconfig.json`) defines a policy that scans files in the specified path and then tags those not accessed for 180 days as `COLD`.
3. Run the following Policy Manager commands as a Hadoop user:

   ```
   cd /opt/ldo
   python policy_app_runner <policy file> <database file name>
   ```

   Where `<policy file>` is the name of the policy file created in the previous step, and `<database file name>` is the location where you want to place the files.
4. (Optional) To run Policy Manager on a schedule, place the commands from step 3 in `/opt/ldo/policy_manager.sh`, and then run `crontab -e` to add a cron entry for the script.

   For example, add the following entry to the opened crontab file to run Policy Manager at 12:00 AM every Saturday.

   ```
   0 0 * * 6 /opt/ldo/policy_manager.sh
   ```

   **Note:** See [CronJob](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/) to add scheduling parameters that meet your specific needs.
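To make the schedule in step 4 easier to read, the following Python sketch splits a crontab entry into its five standard schedule fields and the command. The `split_cron_entry` helper and the field names are illustrative only and are not part of Policy Manager.

```python
# Field names follow the standard five-field crontab layout.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_entry(entry):
    """Split a crontab line into its schedule fields and the command to run.

    Hypothetical helper for illustration; it does not validate field values.
    """
    parts = entry.split(None, 5)  # five schedule fields, then the command
    if len(parts) != 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


schedule, command = split_cron_entry("0 0 * * 6 /opt/ldo/policy_manager.sh")
for field in CRON_FIELDS:
    print(f"{field}: {schedule[field]}")
print(f"command: {command}")
```

In the example entry, minute `0` and hour `0` select 12:00 AM, the two `*` fields match every day of the month and every month, and day of week `6` selects Saturday.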

Policy Manager scans the nodes, compares files, and then moves files tagged COLD into the specified Data Optimizer volume.
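As a rough illustration of the policy file from step 2, the Python sketch below builds the same policy as a dictionary, checks it for the four fields shown in the example, and prints it as comment-free JSON. The `validate_policies` and `is_cold` helpers are hypothetical; treating all four fields as required is an assumption based on the example, and `is_cold` only mirrors the described retention rule rather than Policy Manager's actual implementation.

```python
import json
import time

# Assumption: every policy carries the four fields shown in the example.
REQUIRED_KEYS = {"name", "path", "tag", "retention"}
SECONDS_PER_DAY = 86400


def validate_policies(policies):
    """Return a list of problems found in a parsed policy dictionary."""
    problems = []
    for key, policy in policies.items():
        missing = REQUIRED_KEYS - set(policy)
        if missing:
            problems.append(f"{key}: missing {sorted(missing)}")
        elif not isinstance(policy["retention"], int) or policy["retention"] < 0:
            problems.append(f"{key}: retention must be a non-negative integer")
    return problems


def is_cold(last_access_epoch, retention_days, now=None):
    """Illustrate the retention rule: not accessed for more than N days."""
    now = time.time() if now is None else now
    return (now - last_access_epoch) > retention_days * SECONDS_PER_DAY


policies = {
    "policy1": {
        "name": "policy1",
        "path": "/data/path/to/scan",
        "tag": "COLD",
        "retention": 180,
    }
}

problems = validate_policies(policies)
if problems:
    print("\n".join(problems))
else:
    # Plain JSON output, suitable for saving as the policy file.
    print(json.dumps(policies, indent=4))
```

Checking the file before running step 3 can catch a missing field or a malformed retention value early, before a scheduled run depends on it.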


