# Read Metadata


You can use the Read Metadata step to search for and retrieve existing metadata in Pentaho Data Catalog that is associated with specific data resources registered in Data Catalog.

Specifically, you could create a transformation that searches Data Catalog for existing metadata that points to data in CSV files and Parquet files stored in HDFS or Amazon S3. You can then pass all the associated metadata, including the location of the data, to other steps within your transformation for processing.

For example, you could use the Read Metadata step to retrieve the metadata for a data file's cluster location and then pass that metadata to a Text File Input step or a Catalog Input step, which retrieves the file’s contents for an ETL operation on the data. The transformation can then write the new data contents back to the same file or to a new file.
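The data flow described above can be sketched outside of PDI as a minimal Python illustration. This is not the Data Catalog API or the step's implementation: the record fields (`name`, `format`, `location`), the sample paths, and the `select_resources` helper are all hypothetical, chosen only to show how retrieved metadata might be filtered and its location handed to a downstream reader.

```python
# Hypothetical metadata records, standing in for what a metadata search might return.
# Field names and paths are illustrative assumptions, not the Data Catalog schema.
records = [
    {"name": "sales.csv", "format": "CSV", "location": "hdfs://cluster/data/sales.csv"},
    {"name": "events.parquet", "format": "Parquet", "location": "s3://bucket/events.parquet"},
    {"name": "notes.docx", "format": "DOCX", "location": "hdfs://cluster/docs/notes.docx"},
]

# Formats this sketch passes downstream (assumed to mirror the CSV and Parquet data
# mentioned in the text above).
SUPPORTED_FORMATS = {"csv", "parquet"}


def select_resources(records, formats=SUPPORTED_FORMATS):
    """Keep only the records whose format is in the supported set."""
    return [r for r in records if r["format"].lower() in formats]


# Filter the metadata, then extract the locations a downstream input step would read.
selected = select_resources(records)
locations = [r["location"] for r in selected]

for location in locations:
    print(f"Would pass to an input step: {location}")
```

In an actual transformation, the equivalent of `locations` would travel as fields on the rows passed from the Read Metadata step to the next step.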

The Read Metadata step includes search options to identify, locate, and retrieve the metadata associated with the available data resources listed in Data Catalog.

For more information about accessing Pentaho Data Catalog in PDI, see [PDI and Data Catalog](/pdia-data-integration/9.3-data-integration/advanced-topics-pentaho-data-integration-overview/pdi-and-lumada-data-catalog-ldc.md).

**Note:** This step is supported on the PDI engine but not on the Spark engine. Only CSV text file and Parquet data formats are currently supported. You must have role permissions set in Data Catalog to read the data resources.

