# Advanced Pentaho Data Integration topics

The following topics help you extend your knowledge of PDI beyond basic setup and use:

* [PDI and Hitachi Content Platform (HCP)](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/pdi-and-hitachi-content-platform-hcp.md)

  You can use PDI transformation steps to improve your HCP data quality before storing the data in other formats, such as JSON, XML, or Parquet.
* [Hierarchical data](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/hierarchical-data.md)

  You can manipulate structured, complex, and nested data types, and load filtered subsets of large JSON files.
* [PDI and Snowflake](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/pdi-and-snowflake-cp.md)

  Using PDI job entries for Snowflake, you can load your data into Snowflake and orchestrate warehouse operations.
* [Metadata discovery](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/copybook-steps-in-pdi-cp/metadata-discovery.md)

  You can use metadata discovery to automate the tedious process of manually identifying and determining metadata from COBOL copybooks and databases.
* [Use Streamlined Data Refinery (SDR)](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/work-with-the-streamlined-data-refinery.md)

  You can use SDR to build a simplified, purpose-specific ETL refinery composed of a series of PDI jobs that take raw data, augment and blend it through the request form, and then publish it for use in Analyzer.
* [Use Command Line Tools](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/use-command-line-tools-to-run-transformations-and-jobs.md)

  You can use PDI's command line tools to execute PDI content from outside of the PDI client.
* [Metadata Injection](/pdia-data-integration/10.2-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection.md)

  You can insert data from various sources into a transformation at runtime.
* [Use Carte Clusters](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/use-carte-clusters.md)

  You can use Carte to build a simple web server that allows you to run transformations and jobs remotely.
* [Connecting to a Hadoop cluster with the PDI client](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/connecting-to-a-hadoop-cluster-with-the-pdi-client-article.md)

  Use transformation steps to connect to a variety of big data sources, including Hadoop, NoSQL databases such as MongoDB, and analytical databases.
* [Partition Data](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/partitioning-data.md)

  Split a data set into a number of subsets according to a rule that is applied to each row of data.
* [Use a Data Service](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/pentaho-data-services.md)

  Query the output of a step as if the data were stored in a physical table by turning a transformation into a data service.
* [Use Data Lineage](broken://pages/UIVH7gpBZmc74uQLRHxJ)

  Track your data from source systems to target applications and take advantage of third-party tools, such as Meta Integration Technology (MITI) and yEd, to track and view specific data.
* [Use the Marketplace](/pdia-data-integration/10.2-data-integration/advanced-topics-pentaho-data-integration-overview/use-the-pentaho-marketplace-to-manage-plugins.md)

  Download, install, and share plugins developed by Pentaho and members of the user community.

**Note:** If you want to develop custom plugins that extend PDI functionality or embed the engine into your own Java applications, see the **Administer Pentaho Data Integration and Analytics** document.
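
To make the rule-based split described under Partition Data more concrete, the following Python sketch assigns each row to a partition using the remainder of dividing an integer key by the partition count. It is an illustration of the general idea only, not PDI code or a PDI API: the function name `partition_rows`, the field name `customer_id`, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def partition_rows(rows, key, partition_count):
    """Group rows by (row[key] % partition_count).

    Hypothetical helper for illustration; assumes row[key] is an integer.
    """
    partitions = defaultdict(list)
    for row in rows:
        # The rule is evaluated once per row and decides its partition.
        partitions[row[key] % partition_count].append(row)
    return dict(partitions)


# Hypothetical sample data: six rows keyed by customer_id 1..6.
rows = [{"customer_id": i, "amount": 10 * i} for i in range(1, 7)]
partitions = partition_rows(rows, "customer_id", 3)

for number in sorted(partitions):
    print(number, [row["customer_id"] for row in partitions[number]])
```

Because the rule depends only on the key value, rows with the same key always land in the same partition, which is what makes per-partition processing of related rows possible.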

