Manage data operations

In Pentaho Data Catalog, data operations refer to the various processes and strategies employed to optimize the storage and management of data across different environments. Data operations are designed to improve performance, reduce costs, and ensure efficient utilization of storage resources. You can monitor, verify, and examine details about your past, current, and pending data operations on the Data Operations page.

Tour the Data Operations page

In Data Catalog, the Data Operations page provides an interface for viewing data operations. To access the page, see View data operations. On the page, you can filter operations using the Completed and Failed tabs. You can also enter keywords in Search to find data operations by name.

Data Operations Landing Page

The Data Operations page lists data operations according to their status. The following table describes the features and information available on the page.

| Column | Description |
| --- | --- |
| (More options icon) | Click More options and choose Rehydration to rehydrate a file. For more information, see Rehydrate. |
| Name | Name of the file. |
| Path | Path of the file. |
| Source | The source of the file. |
| Type | The action performed or scheduled, shown as an icon: file tiered, file purged, or file rehydrated. |
| Destination | The destination of the file. |
| Source Type | The source type of the file. |
| Destination Type | The data target type for the tiered or rehydrated file. |
| Status | The status of the data operation: SUCCESS, INIT (Initializing), or FAILED. |
| Tag | The tag applied to the file. |
| Action | The method used to begin the operation: UI or RULE. |
| File Format | The file format of the data. |
| Size | The size of the file. |
| Message | The message returned about the data operation by Data Optimizer. |
| Started | Time the operation began. |
| Ended | Time the operation ended. |
| Time taken | Total time taken for the operation. |
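To make the relationship between these columns concrete, the following Python sketch models a data operation as a record and mimics the status filter and name search available on the page. It is illustrative only: the `DataOperation` class, the `filter_operations` function, the lowercase type labels, and the sample records are assumptions made for this example and are not part of any Data Catalog API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DataOperation:
    """Hypothetical record mirroring a few columns of the Data Operations table."""
    name: str
    path: str
    type: str      # assumed labels: "tiered", "purged", "rehydrated"
    status: str    # "SUCCESS", "INIT", or "FAILED"
    action: str    # "UI" or "RULE"
    started: datetime
    ended: Optional[datetime] = None

    @property
    def time_taken(self) -> Optional[timedelta]:
        # Time taken is Ended minus Started; undefined while no end time exists.
        if self.ended is None:
            return None
        return self.ended - self.started


def filter_operations(ops: List[DataOperation],
                      status: Optional[str] = None,
                      keyword: str = "") -> List[DataOperation]:
    """Keep operations matching a status (if given) and a case-insensitive name keyword."""
    keyword = keyword.lower()
    return [
        op for op in ops
        if (status is None or op.status == status)
        and keyword in op.name.lower()
    ]


# Sample records invented for the example.
ops = [
    DataOperation("sales_2023.csv", "/data/sales", "tiered", "SUCCESS", "RULE",
                  datetime(2024, 1, 5, 10, 0, 0), datetime(2024, 1, 5, 10, 0, 42)),
    DataOperation("logs.parquet", "/data/logs", "purged", "FAILED", "UI",
                  datetime(2024, 1, 5, 11, 0, 0), datetime(2024, 1, 5, 11, 0, 3)),
    DataOperation("sales_2024.csv", "/data/sales", "rehydrated", "INIT", "UI",
                  datetime(2024, 1, 6, 9, 30, 0)),
]

failed = filter_operations(ops, status="FAILED")
sales = filter_operations(ops, keyword="sales")
print([op.name for op in failed])
print([op.name for op in sales])
print(ops[0].time_taken)
```

In this sketch, an operation without an end time reports no time taken, which is one plausible way to treat an operation that is still initializing.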

View data operations

You can use the Data Operations page to review the results of data operations. It provides a detailed view of all completed, failed, or terminated operations, helping you ensure data integrity and traceability.

Perform the following steps to view data operations:

  1. On the left navigation menu, click Data Operations.

    The Manage Data Operations page appears.

  2. On the Data Operations card, click one of the following: Submitted, Completed, Failed, or Terminated.

    The Data Operations page appears, displaying data operation records for the selected status. You can see detailed metadata about each data operation. For more information, see Tour the Data Operations page.

Rehydrate

You can restore tiered files using rehydration. Rehydration uses the stub file created from tiering to restore the contents of the file. The stub file contains recall information that Data Catalog uses to rehydrate the file from the target to its original data source. You can selectively recall any file from its storage location and perform rehydration if a stub file exists in the file system.

Note:

  • When a file is tiered, the last access time of the file does not change.

  • You cannot rehydrate a purged or deleted file.
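The stub-file idea can be illustrated with a small, self-contained Python simulation. This is a conceptual sketch only and does not reflect how Data Catalog or Data Optimizer implements tiering: the `tier` and `rehydrate` functions, the `.stub` suffix, the JSON stub layout, and the choice to copy rather than move the file back are all assumptions made for the example.

```python
import json
import os
import shutil
import tempfile


def tier(source_path, target_dir):
    """Move a file to target_dir and leave a JSON stub with recall information."""
    target_path = os.path.join(target_dir, os.path.basename(source_path))
    shutil.move(source_path, target_path)
    with open(source_path + ".stub", "w") as stub:
        json.dump({"original": source_path, "target": target_path}, stub)


def rehydrate(source_path):
    """Restore a tiered file to its original location using its stub."""
    stub_path = source_path + ".stub"
    if not os.path.exists(stub_path):
        # Without a stub there is no recall information to follow.
        raise FileNotFoundError("no stub file, cannot rehydrate: " + source_path)
    with open(stub_path) as stub:
        recall = json.load(stub)
    shutil.copy2(recall["target"], recall["original"])
    os.remove(stub_path)


with tempfile.TemporaryDirectory() as root:
    source_dir = os.path.join(root, "source")
    target_dir = os.path.join(root, "target")
    os.makedirs(source_dir)
    os.makedirs(target_dir)

    report = os.path.join(source_dir, "report.txt")
    with open(report, "w") as f:
        f.write("quarterly numbers")

    # Tiering removes the contents from the source and leaves only the stub.
    tier(report, target_dir)
    tiered_away = not os.path.exists(report)

    # Rehydration follows the stub back to the target and restores the file.
    rehydrate(report)
    with open(report) as f:
        restored = f.read()

    # A file with no stub (for example, one that was purged) cannot be rehydrated.
    try:
        rehydrate(os.path.join(source_dir, "purged.txt"))
        purged_error = False
    except FileNotFoundError:
        purged_error = True

print(tiered_away, restored, purged_error)
```

The final check mirrors the note above: when no stub exists for a file, the sketch has nothing to recall from and raises an error instead of restoring anything.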

Rehydrate a migrated file

Perform the following steps to rehydrate a file from any data source except HDFS:

  1. On the Menu, click Management.

    The Manage Your Environment page opens.

  2. On the Data Operations card, click Submitted, Completed, or Failed.

    The Data Operations page appears.

  3. On the Data Operations page, locate the data operation or file that you want to restore, click the More options icon, and then select Rehydrate.

    Note: You can also use Search to locate the file by name.

    Rehydration begins. You can monitor the status of the file’s rehydration on the Data Operations page.
