Evaluate and learn Pentaho Data Integration (PDI)
As you explore Pentaho Data Integration (PDI), you will be introduced to the major components, watch videos, work through hands-on examples, and read about the different features. Review the documentation and contact Pentaho sales support if you have questions.
PDI basics
This section introduces basic PDI terms and concepts.
Get a basic understanding of what PDI does.
Watch a video on how PDI fits into the Business Analytics Platform.
Read about PDI architecture in the Pentaho Data Integration document.
Get acquainted with the PDI client
Spoon is the PDI design tool.
In this section, you set up Spoon, tour the interface, and learn about its perspectives.
Review the hardware and software requirements.
Download the trial version and install the Pentaho Suite.
To install only PDI, see Custom installation.
Configure Pentaho Server memory.
Start Pentaho Server.
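As a sketch of the two steps above: on a Linux archive install, server memory is set through the JVM options in the startup script, and the server is started from its install directory. The install path and heap sizes below are illustrative assumptions, not required values.

```shell
# Sketch only: the install path and heap values are assumptions.

# Raise the JVM heap by editing the CATALINA_OPTS line in
# start-pentaho.sh, for example:
#   export CATALINA_OPTS="-Xms2048m -Xmx6144m"

# Then start the server from its install directory:
cd ~/pentaho/server/pentaho-server
./start-pentaho.sh
```

Once startup completes, the web console is served from the port configured for the server (8080 by default on archive installs).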
Access the PDI client. See the Pentaho Data Integration document.
Tour Spoon perspectives. See the Pentaho Data Integration document.
Review terminology and basic concepts in the Pentaho Data Integration document.
Build transformations and jobs
Once your environment is ready, start building.
Create a connection to the Pentaho Repository.
Work through the Pentaho Data Integration (PDI) tutorial.
Create a job to execute a transformation.
Schedule a job to run later.
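Scheduling is typically set up from Spoon against the Pentaho Server, but as a minimal sketch, a job can also be run on a schedule with cron and the Kitchen command-line tool (covered later in this topic). The install path, job file, and log path below are assumptions.

```shell
# Illustrative only: all paths are assumptions.
# crontab entry to run a PDI job every night at 02:00, appending
# its output to a log file:
#
#   0 2 * * * /opt/pentaho/data-integration/kitchen.sh \
#       -file=/opt/etl/load_sales.kjb -level=Basic \
#       >> /var/log/etl/load_sales.log 2>&1
```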
Explore Big Data and Streamlined Data Refinery
Learn to connect to big data sources like Hadoop, NoSQL, and analytical databases.
Watch one of the Big Data videos.
Learn Streamlined Data Refinery. See Pentaho Data Integration.
Learn auto-modeling with the Build Model job entry. See Pentaho Data Integration.
Review big data steps included with Spoon. See Commonly used PDI steps and entries.
Review supported Hadoop distributions and configuration. See Pentaho, big data, and Hadoop.
Edit transformations and metadata models. See Pentaho Data Integration.
Watch a video on using PDI to blend Big Data.
To follow the Hadoop configuration guidance, have a cluster available.
About Kitchen, Pan, and Carte
Pan and Kitchen are command-line tools for running transformations and jobs; Carte is a lightweight web server for running them remotely or in a cluster.
Use Pan and Kitchen to run transformations and jobs.
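As a sketch, assuming a shell session in the data-integration directory: Pan runs transformations (.ktr files) and Kitchen runs jobs (.kjb files). The file paths and parameter name below are illustrative assumptions.

```shell
# Illustrative invocations; file paths and TARGET_DB are assumptions.
# Pan runs a transformation (.ktr):
./pan.sh -file=/opt/etl/clean_customers.ktr -level=Basic

# Kitchen runs a job (.kjb), optionally passing named parameters:
./kitchen.sh -file=/opt/etl/load_warehouse.kjb \
    -param:TARGET_DB=staging -level=Detailed
```

Both tools exit with a non-zero status on failure, which makes them straightforward to call from scripts and external schedulers.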
Use Carte to run work remotely or in a cluster:
Run transformations and jobs on a Carte cluster.
Schedule jobs to run on a remote Carte server.
Start or stop Carte from the command line or a URL.
Run repository-based transformations and jobs on a Carte server.
See the Pentaho Data Integration document for details.
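The steps above can be sketched from the command line. The host, port, and credentials below are assumptions (Carte's defaults may differ in your version and are set in its configuration).

```shell
# Illustrative commands; host, port, and credentials are assumptions.
# Start a Carte server on this machine, listening on port 8081:
./carte.sh localhost 8081

# Check its status over HTTP:
curl -u cluster:cluster http://localhost:8081/kettle/status/

# Stop it again from a URL:
curl -u cluster:cluster http://localhost:8081/kettle/stopCarte/
```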
Learn more
After your first evaluation, go deeper.
Use newer steps and entries, like Spark Submit. See Pentaho Data Integration.
Turn a transformation into a data service. See Pentaho Data Integration.
Use the ETL Metadata Injection step. See Pentaho Data Integration.
Review the What's New document.
Create other data integration solutions. See Pentaho Data Integration.
Administer PDI. See the administration documentation.
Integrate with security protocols like Pentaho security, LDAP, MSAD, and Kerberos. See the administration documentation.
Explore the developer center section in the administration documentation.