About Pentaho workflows
The route that you take with Pentaho depends on your expertise, business needs, and data. It also depends on what you want to analyze and report. During an evaluation, you might use both tracks.
All of the products are integrated to work smoothly together, regardless of which track you ultimately choose. We provide specific details within the workflow discussions; here are the high-level use cases for each track.
Business Analytics (BA) Track: Great for analysis and reporting. Meant primarily for business users; no special skills are required to use the components involved. This track enables anyone to build Pentaho solutions without programming or a deep understanding of data structures.
Data Integration (DI) Track: Meant for data design professionals and requires a working knowledge of data structures and modeling, as well as extract, transform, and load (ETL) processes. With this track, you can directly manipulate data from multiple sources, making it scalable and efficient for enterprise-wide analysis and reporting.
Each track has three workflows: one for Evaluation, one for Development, and one for Production.
Evaluate and Learn: If you used the trial download on the Pentaho website and want to get a hands-on feel for the components that are best for your implementation, follow the Evaluation Workflow.
Develop Pentaho Solutions: After you have figured out which components are best for you and how to use them, the Develop Workflow is the process you use to build, change, and test Pentaho solutions until they meet your production requirements.
Go Live for Production: When your solution is working just right, the Go Live Workflow shows how to move your solution from development to production.
In this topic
Prepare for the evaluation
This table guides you through the differences between the Business Analytics and Data Integration tracks. It also helps you decide which track to follow for evaluation. You may choose to follow one track, then the other, while you are exploring the software.
Expertise
Business Analytics: No special skills required. Knowledge of business requirements and of what reports and analysis should show.
Data Integration: Knowledge of business requirements. Understanding of data structures and modeling. Knowledge of extract, transform, and load (ETL) processes.
Data Set Description
Business Analytics: A single source of data, or data from multiple sources that has already been transformed and joined into a single data mart or warehouse. Small data sets.
Data Integration: Multiple sources of data, including data you want to transform and join in one or more data marts or warehouses. Large to very large data sets.
Reporting Options
Business Analytics: Offers a wide variety of visualization and reporting options.
Data Integration: Offers more limited but focused reporting options that help you visualize and analyze data. BA tools can be used to generate reports based on DI-processed data.
Data Storage Types
Business Analytics: Relational databases, CSV data sources, SQL queries.
Data Integration: Relational databases, NoSQL or Hadoop databases, big data of any type, data from a web service.
Recommendation
Business Analytics: Best used by business analysts, managers, report designers, and individual business units within an organization or enterprise.
Data Integration: Best used by data scientists, data modelers, data integration and ETL developers, individual business units, and enterprise-wide implementations.
Now that you have an idea of which track you want to follow for evaluation, choose an evaluation method. This decision table explains the different options for evaluation so you can pick the option that works best for you.
Hosted Demo
Track: Business Analytics
Summary: A cloud-based, hands-on, interactive exploration of Business Analytics reports, analysis, visualizations, and dashboards. Here you can see how easy and fun it can be to use Pentaho.
Data source: Pentaho sample data in CSV format.
Hardware/software requirements: Web browser.
Recommendation: Any evaluator who wants an overview of Business Analytics features; recommended for business analysts and report designers. We recommend that you try the Custom Prototype or Trial Download after you do the hosted demo.
Custom Prototype
Track: Business Analytics or Data Integration
Summary: Work with Pentaho analysts and data integration specialists to plan and build a complimentary custom prototype that illustrates what Pentaho can do with your data. A representative will guide you through the entire process.
Data source: Your sample data, including a range of typical data characteristics, in CSV format.
Hardware/software requirements: Varies, depending on your requirements.
Recommendation: All evaluators, particularly big data or Data Integration evaluators; recommended for evaluators who want to explore Business Analytics and Data Integration features using a subset of their own data. Available to first-time customers only.
Trial Download
Track: Business Analytics or Data Integration
Summary: Using our trial software, tutorials, and documentation, install and configure your own work environment. Then build a prototype to get a complete Pentaho experience, from installation and administration through creating your first data models and building reports, analysis, dashboards, and data integration ETL transformations.
Data source: Your sample data, including a range of typical data characteristics, in the format that you commonly use.
Hardware/software requirements: One computer that meets the server requirements stated in the Components Reference.
Recommendation: Any evaluator who wants to work independently with Business Analytics, Data Integration tools, and big data; recommended for evaluators who want to explore Business Analytics and Data Integration features using their own data. Technical support is available to help if you have questions.
Pentaho Data Integration workflows
Pentaho Data Integration (PDI) is a robust extract, transform, and load (ETL) tool that you can use to integrate, manipulate, and visualize your data. You can use PDI to import, transform, and export data from multiple data sources, including flat files, relational databases, Hadoop, NoSQL databases, analytic databases, social media streams, and operational stores. You can also use PDI to clean and enrich the data and to move data between databases.
In this topic
Evaluate and learn PDI
As you explore Pentaho Data Integration (PDI), you will be introduced to the major components, watch videos, work through hands-on examples, and read about the different features.
Review the documentation and contact Pentaho sales support if you have questions.
PDI basics
This section familiarizes you with PDI and introduces you to basic terminology and concepts. Then, you learn how to start and configure Spoon and take a spin through the interface.
Get a basic understanding of what PDI does.
View a video that explains how PDI fits into the Business Analytics Platform.
Read about Pentaho Data Integration architecture in the Pentaho Data Integration document.
Get acquainted with the PDI client
Spoon is the PDI design tool. In this section you will set up Spoon, take a tour of the Spoon interface, and learn about the different Spoon perspectives.
Check out the hardware and software requirements for PDI.
Download trial version of the Pentaho Suite and install the software. (The platform includes PDI.)
Learn how to install PDI only. See Custom installation for details.
Configure the Pentaho Server. Depending on your platform, see Increase Pentaho Server memory limit for installations on Linux or Increase Pentaho Server memory limit for installations on Windows for details.
Start the Pentaho Server. Depending on your platform, see Start and stop the Pentaho Server for configuration on Linux or Start and stop the Pentaho Server for configuration on Windows for details.
Access the PDI client. See the Pentaho Data Integration document for details.
Tour the PDI client perspectives. See the Pentaho Data Integration document for details.
Read about terminology and basic concepts in the Pentaho Data Integration document.
Build transformations and jobs
Now that your environment is set up and you are familiar with the PDI client, you are ready to build transformations and jobs. The following tasks may be helpful.
Create a connection to the Pentaho Repository.
Work through the exercise on Creating a Transformation that involves a flat file. Click through the links at the bottom of the page to complete the exercise.
Create a job to execute the transformation.
Schedule a job to execute the transformation at a later time.
Explore Big Data and Streamlined Data Refinery
In this section, you will learn how to use transformation steps to connect to a variety of big data sources, including Hadoop, NoSQL databases such as MongoDB, and analytic databases. You can then try working through the detailed, step-by-step tutorials and peruse the out-of-the-box steps that Spoon provides. Learn how to work with Streamlined Data Refinery. Then, you will have an opportunity to move beyond the basics and learn how to edit transformations and metadata models.
Watch one of our Big Data Videos.
Learn how to work with Streamlined Data Refinery. See Pentaho Data Integration for details.
Learn how to auto model using the Build Model job entry, and how this feature intersects with Analyzer. See Pentaho Data Integration for details.
Find out what big data steps are available out-of-the-box. See Commonly used PDI steps and entries for details.
Find out which Hadoop distributions are available and how to configure them. See Pentaho, big data, and Hadoop for details.
Note: You should already have a cluster set up to perform this task.
Edit transformations and metadata models. See Pentaho Data Integration for details.
Watch a video about how to use PDI to blend Big Data.
About Kitchen, Pan, and Carte
Kitchen, Pan, and Carte are command line tools for executing transformations and jobs modeled in the PDI client.
Use the Pan and Kitchen command line tools to work with transformations and jobs.
Use Carte clusters to:
Run transformations and jobs on a Carte cluster.
Schedule jobs to run on a remote Carte server.
Start or stop Carte from the command line interface or a URL.
Run transformations and jobs from the repository on the Carte server.
See the Pentaho Data Integration document for details on Kitchen, Pan, and Carte.
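As an illustration of what these command lines look like, here is a minimal Python sketch that assembles Pan and Kitchen invocations. The -file, -level, and -param switches are documented Pan/Kitchen options, but the script locations, .ktr/.kjb file names, and the RUN_DATE parameter below are hypothetical examples; check the Pentaho Data Integration document for the full option list.

```python
# Sketch: build Pan (transformation) and Kitchen (job) command lines.
# Paths and parameter values below are hypothetical examples.

def pan_command(ktr_path, log_level="Basic", params=None):
    """Build a Pan command line that runs a transformation from a .ktr file."""
    cmd = ["./pan.sh", f"-file={ktr_path}", f"-level={log_level}"]
    for name, value in (params or {}).items():
        cmd.append(f"-param:{name}={value}")  # pass a named parameter to the transformation
    return cmd

def kitchen_command(kjb_path, log_level="Basic"):
    """Build a Kitchen command line that runs a job from a .kjb file."""
    return ["./kitchen.sh", f"-file={kjb_path}", f"-level={log_level}"]

print(" ".join(pan_command("/opt/etl/load_sales.ktr", params={"RUN_DATE": "2024-01-31"})))
print(" ".join(kitchen_command("/opt/etl/nightly_load.kjb")))
```

In practice you would hand the assembled list to subprocess.run() and check the exit code; Pan and Kitchen return zero on success and nonzero codes for errors.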
Learn more
Now that you have completed an initial evaluation of PDI, dig a little deeper. Find out how to:
Use newer steps and entries, like Spark Submit. See the Pentaho Data Integration document for details.
Read about how to turn a transformation into a data service. See the Pentaho Data Integration document for details.
Use the ETL Metadata Injection step. See the Pentaho Data Integration document for details.
Check out our What's New document.
Create other Data Integration solutions. See the Pentaho Data Integration document for details.
Administer PDI. See the administration documentation for details.
Integrate with different security protocols, like Pentaho security, LDAP, MSAD, and Kerberos. See the administration documentation for details.
Check out our developer center section in the administration documentation.
Develop your PDI solution
This workflow helps you set up and configure the DI development and test environments, then build, test, and tune your Pentaho DI solution prototype. This process is similar to the trial download evaluation experience, except that you fully configure the Pentaho Server for data integration and work with your own ETL developers.
If you need extra help, Pentaho professional services are available. The goal is to learn DI implementation best practices and deploy your DI solution to a production server. Most development and testing for DI occurs in Spoon.
Before you begin developing your DI solution, we recommend that you attend Pentaho training classes to learn how to install and configure the Pentaho Server, as well as how to develop data models.
This section is grouped into parts that will guide you during the development of your DI solution. These parts are iterative and you might bounce between them during development. For example, as you tune a job, you might find that although you have built a solution that produces the right results, it takes a long time to run. You might need to rebuild and test a transformation to improve efficiency, and then retest it.
Design DI solution
Design helps you think critically about the problem you want to solve and possible solutions. Consider these questions as you gather your requirements and design the solution.
Output
What does the overall solution look like? What questions are you posing, and how do you want the answers formatted?
Data Sources
What types of data sources are you querying? Where are they located? How much data do you need to process? Are you using big data? Are you using relational or non-relational data sources? Will you have a target data source? If so, where is it located?
Content/Processing
What data quality issues do you have? How is the input data mapped to the output data? Where do you want to process the content, in PDI or in the data source? What hardware will you include in your development environment? Will you need one or more quality assurance test environments or production environments?
Also, consider templates or standards, naming conventions, and other requirements of your end users if you have them. Consider how you will back up your data as well.
Set up a development environment
Setting up the environment includes installing and configuring PDI on development computers, configuring clustering if needed, and connecting to data sources. If you have one or more quality assurance environments, you will need to set those up also.
Verify System Requirements
Consult the following reference to verify requirements:
Components Reference
Acquire one or more servers that meet the requirements.
Obtain the correct drivers for your system.
Obtain Software and Install PDI
See the Install Pentaho Data Integration and Analytics document for instructions on the following:
Installing PDI
Starting the Pentaho Server
Starting the PDI client (also known as Spoon)
Get the software from your Sales Support representative.
Install the software.
Start the Pentaho Server and Spoon.
Install licenses for the Pentaho Server
See the Administer Pentaho Data Integration and Analytics document for instructions on installing licenses.
Add all acquired Pentaho licenses.
Connect to the Pentaho Repository
See the Pentaho Data Integration document for instructions on connecting to the Pentaho Repository.
Connect to the Pentaho Repository.
Apply Advanced Security (if needed)
See the Administer Pentaho Data Integration and Analytics document for details on Advanced Security.
Determine whether you need to apply Advanced Security.
Build and test solution
During this step, you develop transformations, jobs, and models, then test what you have developed. You will tune the transformations, jobs, and models for optimal performance.
Development occurs in the PDI client design tool. The PDI client's streamlined design tightly couples the build and test activities so that you can easily perform them iteratively. The PDI client has perspectives to help you perform ETL and visualize data, and it also provides a scheduling perspective that can be used to automate testing.
Testing encompasses verifying the quality of transformations and jobs, reviewing visualizations, and debugging issues. One common method of testing is to include steps in a transformation or job that calculate hash totals, checksums, record counts, and so forth to determine whether data is being properly processed. You can also visualize your data in Analyzer and Report Designer and review the results as you develop. This not only helps you find errors and processing issues, but can also give you a jump on user acceptance testing if you show these reports to your customers or business analysts for early feedback.
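The hash-total technique mentioned above can be sketched in a few lines. The following Python snippet is a minimal illustration (not a PDI step) of computing a record count and a checksum over extracted rows, so that the same totals can be computed on the source and the target and compared; the sample rows are hypothetical.

```python
import hashlib

def control_totals(rows):
    """Return (record_count, hash_total) for an iterable of rows.

    Rows are rendered to a canonical string and fed through MD5, so any
    dropped, duplicated, or altered row changes the hash total.
    """
    digest = hashlib.md5()
    count = 0
    for row in sorted(rows):  # sort so that row order does not matter
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()

# Hypothetical source and target extracts: same data, different row order.
source = [(1, "widget", 9.99), (2, "gadget", 4.50)]
target = [(2, "gadget", 4.50), (1, "widget", 9.99)]
assert control_totals(source) == control_totals(target)
```

Matching totals on both sides are strong evidence that no rows were lost or altered in transit; a mismatch tells you exactly which load to debug.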
One basic question is how to determine the number of transformations and jobs needed, as well as the order in which they should be executed. A good rule of thumb is to create one transformation for each combination of source system and target table. You can often identify these combinations in your mapping documents. Once you have identified the number of transformations that you need, you can use the same process to determine the number of jobs that you need. When considering the order of execution for transformations and jobs, consider how referential integrity is enforced: run target table transformations that have no dependencies first, then run transformations that depend on those tables, and so forth.
Understand the Basics
Read the overview of the PDI client process in the Pentaho Data Integration document.
Review information about the process and perspectives.
Review commonly used steps and entries
Review the commonly used steps and entries.
Review available transformation steps and determine how you can use them for your solution.
Review job entry references to identify which entries can be used in your solution.
Create and Run Transformations
Create and run a transformation. See the Pentaho Data Integration document for details.
Identify the transformations needed for your job and implement them.
Save the transformation.
Run transformations locally.
Create and Run a Job
Create and run a job. See the Pentaho Data Integration document for details.
Create a job.
Arrange transformations in a job so that they execute logically.
Run a job.
Tune solution
Fine-tune transformations and jobs to optimize performance. This involves using various tools such as the DI Operations Mart to determine where bottlenecks or other performance issues occur, and addressing them.
Review the Performance Tuning Checklist and Make Changes to Transformations and Jobs
Review tuning tips. See the Administer Pentaho Data Integration and Analytics document for tuning tips.
Get familiar with things that you can do to optimize performance.
Apply tuning tips as needed.
Consider other performance tuning options
Read about transactional databases. See the Pentaho Data Integration document for details on transactional databases.
Read about using logs. See the Administer Pentaho Data Integration and Analytics document for details on logging.
Learn how to apply transactional databases.
Learn how to use logs to tune transformations and jobs.
Next steps
These resources will be helpful to you as you prepare to Go Live for Production:
Prepare to Go Live for Production - DI.
Support Portal: check with Support for service packs.
Go Live for production - DI
Go Live is the process by which you migrate a prototype to production. This process is divided into four parts:
Setting up the production environment
Deploying the solution
Tuning the solution
Scheduling the runs
Set up production environment
Setting up the environment includes installing the software on production computers, configuring clustering, and connecting to data sources. To set up the environment, install and configure the Pentaho Server, Spoon, and any plugins required. Then set up data sources and clusters.
Verify system requirements
Consult the Components Reference.
Consult the JDBC Drivers Reference.
Acquire one or more servers that meet the requirements.
Obtain the correct drivers for your system.
Obtain software and install the Pentaho Server
Download the Pentaho software.
Start the Pentaho Server. See Install Pentaho Data Integration and Analytics for details.
Start the PDI client. See Pentaho Data Integration for details.
Install the licenses (if necessary). See Administer Pentaho Data Integration and Analytics for details.
Get the software from your Sales Support representative.
Install the software.
Change the Server Fully Qualified URL
Change the ports and URLs. See Administer Pentaho Data Integration and Analytics for details.
Change the server's URL so that you do not have a conflict.
Connect to the Pentaho Repository
Create a connection to the Pentaho Repository. See Pentaho Data Integration for details.
Connect to the Pentaho Repository.
Set up clusters
Optional: Set up clusters. See Pentaho Data Integration for details.
Become familiar with clustering.
Set up clusters, if they are needed in your environment.
Copy configuration files
Copy shared.xml, repositories.xml, kettle.properties, and JAR files from the development environment to the production environment.
System is set up and ready for production.
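A minimal Python sketch of that copy step, assuming hypothetical development and production .kettle directories (adjust the paths to your actual installation; JAR files would be copied into the appropriate lib directories in the same way):

```python
import shutil
from pathlib import Path

# The PDI configuration files to migrate between environments.
CONFIG_FILES = ["shared.xml", "repositories.xml", "kettle.properties"]

def copy_kettle_config(src_dir, dst_dir):
    """Copy the PDI configuration files that exist in src_dir to dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in CONFIG_FILES:
        src = src_dir / name
        if src.exists():                       # skip files not used in development
            shutil.copy2(src, dst_dir / name)  # copy2 preserves timestamps
            copied.append(name)
    return copied

# Hypothetical paths; run on the production host after staging the
# development .kettle directory:
# copy_kettle_config("/staging/dev/.kettle", "/home/pentaho/.kettle")
```

Review kettle.properties after copying: environment-specific values such as database hosts usually differ between development and production.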
Logging and monitoring your server
Review logging and monitoring operations. See Pentaho Data Integration for details.
Enable logging. See Administer Pentaho Data Integration and Analytics for details.
Monitor PDI and SNMP traps. See Administer Pentaho Data Integration and Analytics for details.
Learn about the different ways to log and monitor Pentaho Server operations:
Log through Spoon and Carte
Use SNMP traps with PDI
Deploy solution
Export solutions from the Pentaho Repository in the development or test environment to the Pentaho Repository in the production environment.
Export and Import Pentaho Repository
See Export and Import Pentaho Repository Content in the Administer Pentaho Data Integration and Analytics document.
Export Pentaho Repository content from test environment
Import Pentaho Repository content to production environment
Tune solution
Fine-tune transformations and jobs to optimize performance. This involves using various tools such as the DI Operations Mart to determine where bottlenecks or other performance issues occur, and addressing them.
Review the Performance Tuning Checklist and Make Changes to Transformations and Jobs
Consult the tuning tips. See the Administer Pentaho Data Integration and Analytics document for tuning tips.
Get familiar with things that you can do to optimize performance.
Apply tuning tips as needed.
Consider other performance tuning options
Learn about transactional databases. See the Pentaho Data Integration document for details on transactional databases.
Learn about using logs. See the Administer Pentaho Data Integration and Analytics document for details on logging.
Learn how to apply transactional databases.
Learn how to use logs to tune transformations and jobs.
Schedule runs
Use the PDI client, Pan, or Kitchen to schedule executions of transformations and jobs.
Schedule Transformations and Jobs From Spoon
Schedule transformations and jobs. See the Pentaho Data Integration document for details.
Schedule transformations and jobs
Command Line Scripting Through Pan and Kitchen
Learn about Pan's options. See the Pentaho Data Integration document for details.
Learn about Kitchen's options. See the Pentaho Data Integration document for details.
Use Pan and Kitchen to schedule transformations and jobs.
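On Linux, scheduling with Kitchen typically means a cron entry that invokes kitchen.sh. Here is a small Python sketch that renders such an entry; the installation path, job file, and log file shown are hypothetical examples, while -file and -log are standard Kitchen options.

```python
# Render a crontab line that runs a PDI job via Kitchen.
# The kitchen.sh path below is a typical, but hypothetical, install location.
KITCHEN = "/opt/pentaho/design-tools/data-integration/kitchen.sh"

def cron_entry(minute, hour, kjb_path, log_path):
    """Build a crontab line that runs the given job every day at hour:minute."""
    return f"{minute} {hour} * * * {KITCHEN} -file={kjb_path} -log={log_path}"

# Hypothetical job: run nightly_load.kjb every night at 02:30.
print(cron_entry(30, 2, "/opt/etl/nightly_load.kjb", "/var/log/etl/nightly_load.log"))
```

Writing the rendered line into the crontab of the user that owns the Pentaho installation gives you unattended nightly runs; the -log file is where you look first when a run fails.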
Next steps
These resources will be helpful to you after your production server is live.
Fine-tune Pentaho systems: Provides guidance on how to maintain and fine-tune your Pentaho Server. See the Administer Pentaho Data Integration and Analytics document for details.
Pentaho Training and Education
Support Portal: Check with support for service packs.
Commonly used PDI steps and entries
Although there are over 330 transformation steps and job entries, some are used more often than others. If you are creating a transformation or job but do not know where to begin, this list might be helpful.
Top ten transformation steps
PDI transformation steps are documented in Pentaho Data Integration.
Text File Input
Table Input
Microsoft Excel Input
Text File Output
Table Output
Microsoft Excel Writer
Select Values
Filter Rows
Group By
Stream Lookup
Other commonly used transformation steps
PDI transformation steps are documented in Pentaho Data Integration.
INPUT: Generate Rows, Data Grid, Get Data from XML, CSV File Input, Fixed File Input
OUTPUT: XML Output
TRANSFORM: Split Fields, Calculator, Add Constants, Add Sequence, Replacing Strings, Sort Rows, String Operations, Strings Cut
SCRIPTING: User Defined Java Class, Modified Java Script Value, User Defined Java Expression
FLOW: Abort, Append Streams, Block this step until steps finish, Blocking Step, Detect Empty Stream, Dummy, ETL Metadata Injection, Filter Rows, Identify Last Row in a Stream, Java Filter, Job Executor, Prioritize Streams, Single Threader, Switch/Case, Transformation Executor
LOOKUP
JOINS: Join Rows, Merge Join
JOB: Get Variables, Set Variables
Commonly used job entries
PDI job entries are documented in Pentaho Data Integration.
GENERAL: Start, Job, Transformation, Success
UTILITY: Abort
MAIL: Mail
FILE MANAGEMENT: Add filenames to result, Compare folders, Convert file between Windows and Unix, Copy Files, Create a folder, Create file, Delete file, Delete filenames from result, Delete files, Delete folders, File Compare, HTTP, Move Files, Process result filenames, Unzip file, Wait for file, Write to file, Zip file
UTILITY: Write to log
Pentaho Business Analytics workflow
Pentaho Business Analytics is a combined business analytics and data integration platform that allows business users, data scientists, and IT administrators to easily access, explore, and visualize their data. Pentaho empowers business users to make information-driven decisions that positively impact their organization’s performance, data scientists to use a full spectrum of tools to create robust data models, and IT to rapidly deliver a secure, scalable, flexible, and easy-to-manage business analytics platform for the broadest set of users.
Workflow stages
Use these sections to move from evaluation to production:
Evaluate and learn Pentaho Business Analytics
As you explore Pentaho Business Analytics, you will be introduced to the major components, watch videos, work through hands-on examples, and learn about the different features.
Go at your own pace. Feel free to dig into the documentation or to contact Pentaho sales support if you have questions.
Use the sections below to get familiar with Business Analytics:
Tour the User Console and create your first reports
The User Console is a web-based design environment where you can analyze data, create interactive reports and dashboard reports, and build integrated dashboards to share business intelligence solutions with others in your organization and on the internet. In addition to its design features, the User Console offers a wide variety of system administration features for configuring the Pentaho Server, managing Pentaho licenses, setting up security, managing report scheduling, and tailoring system performance to meet your requirements.
If you have installed the trial download on your laptop or desktop machine, you are ready to get started exploring. If you have the software installed on a server, and want to use your machine to point to it, see Develop your BA environment for details.
Tour the User Console
Understand the features of the User Console
View the sample reports on the Samples tab of the Getting Started section
Create Your First Reports and Dashboards
Create and save an Interactive Report
Create and save an Analysis Report
Create and save a custom Dashboard
Schedule Your Report
Learn about scheduling reports. See the Pentaho Business Analytics document for details.
Schedule a report to run and email automatically.
Receive your report through email after the schedule runs.
Explore and learn data source basics
If you have already worked with the Steel Wheels sample data and want to learn how to create your own data sources and data models with Pentaho, use the Data Source Wizard. The Data Source Wizard helps you define a data source that contains the data you want to use and guides you through the creation of your evaluation data model for use in creating reports.
After you define a data source, you can make it available to other evaluators so they can create reports and analysis by simply picking the data source from the data source list. Any number of reports can be created using a single data source.
Create Your First Data Source
Create a Data Source
Tour the Data Source Wizard
See Pentaho Business Analytics for instructions.
Understand how the Pentaho Server and Data Source Wizard work together to create usable data sources and data models.
Explore the Data Source Wizard interface.
Learn the basics of creating a data source using the Data Source Wizard.
Choose Data Source Types
Choose a data source type
See Pentaho Business Analytics for instructions.
Learn about the different data source types supported by the Data Source Wizard.
We recommend using a CSV data source for evaluation.
Create Your First CSV Data Source
Create a CSV data source
See Pentaho Business Analytics for instructions.
Import a CSV data file using the Data Source Wizard.
Create the CSV data source.
We recommend creating a report using this new CSV data source, then refining the data model with the Data Source Model Editor as needed.
Refine Your Data Source Model
Edit multidimensional data source models.
See Pentaho Business Analytics for instructions.
(Optional) Edit your evaluation data source model using the Data Source Model Editor.
Inline Model Editing
Read Working with Analyzer measures in the Pentaho Business Analytics document.
Learn how to edit your data models while working in Analyzer.
Learn about Streamlined Data Refinery
Learn how to work with Streamlined Data Refinery
See Pentaho Data Integration for instructions.
Learn how Streamlined Data Refinery works.
Learn about Report Designer
Pentaho Report Designer is a report creation tool that you can use by itself, or as part of the Pentaho Suite. It allows professionals to create print-quality reports based on data from virtually any type of data source.
These resources in the Pentaho Report Designer document will help you get familiar with the Report Designer interface, and guide you through the creation and publishing of a print-quality report.
Explore the Report Designer Interface
Explore Report Designer
Tour the Report Designer interface before you begin building reports.
Report Designer Workflow Overview
Learn about Report Designer workflow
Look over the workflow concepts for Report Designer.
Create Your First Report
Create your first print-quality report
Create a report.
Add a chart and parameters to your report.
View and then publish your report.
Refine the Look of Your Report
Design print-quality reports
Explore more advanced features of Report Designer, beginning with report elements.
Add a PDI Data Source
Add a PDI data source
Add a PDI data source and use it to create a report in Report Designer.
Discover more about Pentaho Business Analytics
The Pentaho Analyzer, Interactive Reports, and Dashboard Designer plugins provide in-depth details about creating eye-catching business intelligence deliverables for your user community. See the Pentaho Business Analytics document for details.
If you are a system administrator, check out the Install Pentaho Data Integration and Analytics document. It has details on configuring and administering your Pentaho Server using the User Console, as well as a section on the variety of things you can do to maintain your server manually.
Next steps (evaluation)
Contact Pentaho to learn more about how Business Analytics can be tailored to meet your business needs.
Continue with Develop your BA environment.
Develop your BA environment
This workflow outlines how to set up a Pentaho Server for BA development. It also covers how to build, refine, and test BA content.
This workflow is similar to the trial download evaluation experience, except that you configure the server fully and work with your own report designers and data scientists. You can also engage Pentaho professional services.
Before you start, consider Pentaho training classes, which cover installing and configuring the server as well as building data models and BA applications.
Set up your Pentaho Server
Use this checklist to verify requirements. Then install and configure the Pentaho Server and BA design tools.
Verify system requirements
Review required components in Components Reference.
Review required drivers in JDBC drivers reference.
Acquire one or more servers that meet requirements.
Obtain the correct drivers for your system.
Obtain software and install the Pentaho Server
Obtain the Pentaho software from your Sales Support representative.
Install the software using Install the 30-day trial of Pentaho Data Integration and Analytics.
Sign in to the User Console. See Quick tour of the Pentaho User Console.
Tour Administration.
Change the default administrator password.
Change the Pentaho Server fully qualified URL
Follow Administer Pentaho Data Integration and Analytics instructions to change the server URL.
If multiple machines point to one server, confirm all clients use the new URL.
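As a sketch, assuming a default install layout, the fully qualified URL is set in the server.properties file under the system directory. The host name and port below are placeholders; confirm the file location and key name for your release in the Administer document.

```properties
# <install>/server/pentaho-server/pentaho-solutions/system/server.properties
# Replace the host and port with your server's values; keep the trailing slash.
fully-qualified-server-url=http://yourserver.example.com:8080/pentaho/
```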
Configure the Pentaho Server
Manage licenses. See Administer Pentaho Data Integration and Analytics.
Configure server data connections. See Install Pentaho Data Integration and Analytics.
Configure email for scheduled reports. See Pentaho Business Analytics.
Review schedule management. See Pentaho Business Analytics.
Configure BA design tools
Do this only on a development system. Do not configure design tools on your production server.
Configure design tools and utilities. See Install Pentaho Data Integration and Analytics.
Configure each tool’s data connections. See Install Pentaho Data Integration and Analytics.
Import data sources and data models
Create data sources and models that support agile BA development.
Choose data source types
Choose a data source type. See Pentaho Business Analytics.
Review relational versus multidimensional models.
Create data sources and models
Tour the Data Source Wizard. See Pentaho Business Analytics.
Learn how the server and wizard produce usable sources and models.
Create database table data sources
Create a database table source. See Pentaho Business Analytics.
Create initial data sources and preliminary models.
Learn about Mondrian schemas
Create and modify Mondrian schemas. See Pentaho Schema Workbench.
Add a Mondrian data source.
Adapt the schema for Analyzer.
Refine the schema in Schema Workbench.
Create reports and further refine data models
Work with data scientists and business analysts at this stage to improve the quality of your models and reports.
As you prepare to move to production, use data sources from:
Pentaho Schema Workbench
Pentaho Metadata Editor
Create Analyzer reports, Interactive reports, and dashboards
Follow Pentaho Business Analytics instructions.
Create Interactive and Analyzer reports.
Create a dashboard.
Verify results match what you need.
If needed, refine models with your data team.
Create a report with Report Designer (optional)
Follow Pentaho Report Designer instructions.
Refine your data source model
Edit multidimensional models. See Pentaho Business Analytics.
Refine Mondrian schemas. See Pentaho Schema Workbench.
Refine relational models. See Pentaho Metadata Editor.
Recreate reports to validate changes.
Repeat until results meet requirements.
Test environment quality
If you do quality assurance testing, upload content to the Pentaho Repository. Then download it to the QA server. See Administer Pentaho Data Integration and Analytics for details.
Some organizations also run user acceptance testing after QA.
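One way to move content between servers is Pentaho's import-export command-line utility. The sketch below exports a repository folder to a zip file that you can then import on the QA server. The URL, credentials, and paths are placeholders, and the flag names should be verified against your release's documentation.

```shell
./import-export.sh --export \
  --url=http://localhost:8080/pentaho \
  --username=admin --password=password \
  --path=/public/development \
  --file-path=/tmp/dev-content.zip \
  --charset=UTF-8 --withManifest=true
```

Run the same utility with --import on the target server to load the zip file into its repository.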
Next steps (development)
Investigate security. See Administer Pentaho Data Integration and Analytics.
Plan scheduling for production. See Pentaho Business Analytics.
Decide what content to promote to production. See Administer Pentaho Data Integration and Analytics.
Check the Support Portal for service packs.
Prepare to Go live for production - BA.
Go live for production - BA
This section explains how to move Pentaho content and server settings between servers.
This process usually uses two or three servers with identical configurations:
BA content development
Testing and QA (optional)
Production
We recommend working with Pentaho professional services during production deployment.
Prepare for going live
This section has two parts:
A checklist for setting up a Pentaho Server
Prerequisites to complete before you go live
If your production server is already set up, start with the prerequisites.
Pentaho Server setup checklist
Verify system requirements
Consult:
- Components Reference
- JDBC drivers reference

Do:
- Acquire one or more servers that meet requirements.
- Obtain the correct drivers.
Obtain software and install the Pentaho Server
Consult:
- Install Pentaho Suite. See Install Pentaho Data Integration and Analytics.
- Download and install the latest service pack. See Administer Pentaho Data Integration and Analytics.
- Access the User Console. See Pentaho Business Analytics.

Do:
- Install the software.
- Install the latest service pack.
- Access the User Console, review Administration, and change the default administrator password. If needed, change the fully qualified URL for the Pentaho Server.
Change the server fully qualified URL
Change the Pentaho Server fully qualified URL if needed. See Administer Pentaho Data Integration and Analytics.
If many machines point to one server, change the URL and verify connectivity.
Configure the server
Consult:
- Manage licenses. See Administer Pentaho Data Integration and Analytics.
- Specify data connections. See Install Pentaho Data Integration and Analytics.
- Set up email for scheduled reports. See Pentaho Business Analytics.

Do:
- Set up data connections.
- Configure email through Administration.
Prerequisites before you go live
Compare configuration files
Consult:
- Compare server configuration files.
- Verify and increase memory settings. See Administer Pentaho Data Integration and Analytics.

Do:
- Identify configuration differences.
- Commit a unified properties file to version control.
- Increase memory settings as needed.
Verify data sources
Consult:
- Specify data connections. See Install Pentaho Data Integration and Analytics.
- Define JNDI connections. See Install Pentaho Data Integration and Analytics.

Do:
- Confirm data sources can be promoted.
- Establish JNDI sources as replacements if needed.
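As an illustration of what a JNDI replacement source can look like on a Tomcat-based Pentaho Server, here is a hedged sketch of a Resource entry in context.xml. The resource name, driver class, JDBC URL, and credentials are all placeholders; confirm the file location and pool attributes for your release.

```xml
<!-- META-INF/context.xml inside the deployed pentaho webapp
     (location varies by release). All values below are placeholders. -->
<Resource name="jdbc/SampleData" auth="Container"
          type="javax.sql.DataSource"
          factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://dbhost:5432/sampledata"
          username="pentaho_user" password="change_me"
          maxActive="20" maxIdle="5"
          validationQuery="SELECT 1"/>
```

Defining the connection by JNDI name on both servers lets the same report content resolve to different databases in development and production.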
Define security
Consult:
- Define Pentaho Server security. See Administer Pentaho Data Integration and Analytics.
- Manage users and roles. See Pentaho Business Analytics.
- Implement advanced security. See Administer Pentaho Data Integration and Analytics.

Do:
- Implement security.
- Define users, roles, and permissions.
Upload content
Consult:
- Upload and download from the Pentaho Repository. See Administer Pentaho Data Integration and Analytics.

Do:
- Upload files and folders.
Compare configuration files
The most important server configuration settings are stored in the /server/pentaho-server/pentaho-solutions/system/ directory.
Some core settings are also inside the Pentaho WAR archive deployed to your application server. These settings should not change after initial setup.
Do not change the names of content files, data sources, solution directories, or other file names during promotion.
Set names during solution development. Keep names consistent through promotion.
Renaming can cause issues that you will not detect immediately. This can break QA and production content.
To ensure you selected all server configuration files, compare these directories in full:
- /pentaho-solutions/system/
- /WEB-INF/ inside your deployed pentaho.war
- /META-INF/ inside your deployed pentaho.war
Plugin directories for Analyzer, Dashboard Designer, Interactive Reports, and Community Dashboard Framework include binaries.
Binary differences usually indicate version differences. Focus on XML and properties files.
If you customized plugins, promote those changes too.
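A minimal sketch of such a comparison from the command line. For illustration it creates two tiny sample trees; in practice, point the two variables at the system/ directories of your development and production servers. The paths and file names are assumptions, not part of any Pentaho install.

```shell
# Create two small sample trees standing in for the system/ directories
# of two Pentaho Servers (illustrative only -- in practice set DEV and
# PROD to .../server/pentaho-server/pentaho-solutions/system on each).
DEV=$(mktemp -d); PROD=$(mktemp -d)
printf 'fully-qualified-server-url=http://dev:8080/pentaho/\n'  > "$DEV/server.properties"
printf 'fully-qualified-server-url=http://prod:8080/pentaho/\n' > "$PROD/server.properties"
: > "$DEV/lib.jar"; : > "$PROD/lib.jar"

# -r recurses into subdirectories; -x skips binaries, whose differences
# usually just reflect version differences. Focus on XML and properties.
diff -r -x '*.jar' -x '*.class' "$DEV" "$PROD" || true
```

Only the differing properties file is reported; the jar files are excluded from the comparison.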
Move content to production server
This checklist summarizes best practices to promote Pentaho Server settings, data sources, and content.
Before you promote from development to production, complete the preparation and prerequisite tasks earlier on this page.
Download content
Consult:
- Upload and download from the Pentaho Repository.

Do:
- Move all desired content to production. See Administer Pentaho Data Integration and Analytics for details.
Set up schedules and blockout times
Consult:
- Manage schedules.
- Prevent scheduling by setting blockout times.

Do:
- Set up production schedules.
- Set up blockout times for maintenance. See Pentaho Business Analytics for details.
Next steps (production)
These resources are helpful after your production server is live:
See Administer Pentaho Data Integration and Analytics for guidance on maintenance and tuning.
Pentaho Training and Education
Support Portal for service packs