Components reference
Pentaho aims to accommodate diverse computing environments. This list provides details about the environment components and versions we support. Where applicable, versions are listed as certified or supported:
Certified
The version has been tested and validated for compatibility with Pentaho.
Supported
Support is available for listed non-certified versions.
If you have questions about your particular computing environment, contact Pentaho Support.
Server operating system
The Pentaho Server is hardware-independent and runs on server-class computers.
Your server-class computer must comply with the specifications for minimum hardware and required operating systems:
Processor
Intel EM64T or AMD64 Dual-Core or later
RAM
8 GB with 4 GB dedicated to Pentaho servers
Disk Space
20 GB free after installation
Microsoft Windows 2025 Server
Red Hat Enterprise 9*
Ubuntu Server 22.04 LTS
Microsoft Windows 2022 Server
Red Hat Enterprise 8*
*Pentaho Data Integration and Analytics is supported on any Linux distribution binary-compatible with RHEL 9 and Ubuntu Server 22, including in virtualized and cloud environments. If you have any questions, contact Pentaho Support.
Note: macOS is not supported as a server operating system.
Container deployment
Supported technology for deploying Pentaho in containers.
Docker
27.5.1
Note: Kubernetes environments that use this Docker version are also supported.
You can also deploy pre-configured Docker images of specific Pentaho products in your AWS environments. See Docker container deployment of Pentaho Server and Docker container deployment of Carte, Pan, and Kitchen for details.
Workstation operating system
These Pentaho design tools are hardware-independent and run on client-class computers that comply with these specifications for minimum hardware and required operating systems.
Pentaho Aggregation Designer
Pentaho Data Integration
Pentaho Metadata Editor
Pentaho Report Designer
Pentaho Schema Workbench
Processors
Apple Macintosh Dual-Core
Apple Mac M1, M2, and M3 chipset
Intel EM64T or AMD64 Dual-Core or later
RAM
2 GB for most design tools; PDI requires 2 GB dedicated
Disk Space
2 GB free after installation
Minimum Screen Size
1280 x 960 pixels
Ubuntu Desktop 22.04
Microsoft Windows 11
macOS 15 (Sequoia)
macOS 14 (Sonoma)
Note: Ubuntu Linux requires `libwebkitgtk-1.0`. See Install Pentaho Data Integration and Analytics for more information.
Embedded software
When embedding Pentaho software into other applications, the computing environment should comply with these specifications for minimum hardware and required operating systems.
Embedded Pentaho Reporting
Embedded Pentaho Analysis
Embedded Pentaho Data Integration
Note: Pentaho Data Integration and Analytics is officially certified to run on the Red Hat Enterprise and Ubuntu Linux distributions. It is compatible with any binary-compatible Linux distribution that meets the necessary software and hardware requirements, including in virtualized and cloud environments. If you have any questions, contact Pentaho Support.
The following specifications describe the minimum hardware and required operating systems for embedding Pentaho reporting, analysis, and data integration:
Processors
Intel EM64T or AMD64 Dual-Core
RAM
8 GB with 4 GB dedicated to Pentaho servers
Disk Space
20 GB free after installation
Microsoft Windows 2022 Server
Red Hat Enterprise 9
Ubuntu Server 22.04 LTS
Application servers
The server to which you deploy Pentaho software must run the following application server:
Tomcat 10.1.48 (Certified)
Solution database repositories
Pentaho software stores processing artifacts in these database repositories:
PostgreSQL*
16
15
MySQL
8.4
Oracle
23c & 23ai
MS SQL Server
2022
2019 (including patched versions)
MariaDB
11.4
* The default installed solution database.
Apache Hadoop vendors
Pentaho software has certified or supported data sources from these Hadoop vendors.
Amazon EMR
7.7.0
Cloudera Data Platform (CDP) Private Cloud
7.1.x, 7.3.1
Data Sources: Pentaho Tools
This table summarizes which data sources are compatible with the main Pentaho tools.
Pentaho Reporting
JDBC 3/4¹
ODBC
OLAP4J
XML
Pentaho Analysis
Pentaho Data Integration
Pentaho Metadata
Scriptable
Snowflake
Pentaho Server, Action Sequences
Relational (JDBC)
Hibernate
Javascript
Metadata (MQL)
Mondrian (MDX)
XML (XQuery)
Security User/Role List Provider
Snowflake
Data Integration Steps (PDI)
Other Action Sequences
Web Services
XMLA
Pentaho Data Integration
JDBC 3/4¹
OLAP4J
Salesforce
Snowflake
XML
CSV
Microsoft Excel
1 Use a JDBC 3.x or 4.x compliant driver that is compatible with SQL-92 standards when communicating with relational data sources. For a list of drivers to use with relational JDBC databases, see the JDBC drivers reference.
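As an illustration of the footnote above, a JDBC 4.x connection to a relational source is built from a driver prefix, host, port, and database name. The sketch below is hypothetical: the class name and connection details are placeholders, and PostgreSQL is used only as an example of a SQL-92-compliant source with a JDBC driver.

```java
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch of connecting to a relational source with a JDBC 4.x driver.
// The URL, credentials, and query are placeholders; substitute the
// driver and connection details for your own database.
public class JdbcSketch {

    // Assemble a vendor-specific JDBC URL from its parts.
    static String buildUrl(String vendor, String host, int port, String db) {
        return "jdbc:" + vendor + "://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) throws SQLException {
        String url = buildUrl("postgresql", "localhost", 5432, "pentaho");
        System.out.println(url); // jdbc:postgresql://localhost:5432/pentaho

        // With the driver JAR on the classpath, JDBC 4.x drivers
        // self-register, so DriverManager can open the connection directly:
        // try (var con = DriverManager.getConnection(url, "user", "password");
        //      var st = con.createStatement();
        //      var rs = st.executeQuery("SELECT 1")) {
        //     while (rs.next()) { System.out.println(rs.getInt(1)); }
        // }
    }
}
```

The connection itself is left commented out because it requires a running database and the vendor's driver JAR on the classpath.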
Big Data Sources: General
Pentaho software supports the following Big Data sources. Check this list if you are evaluating Pentaho or checking for general compatibility with a specific vendor.
Amazon EMR (via Hive)
7.7.0
Cassandra (Datastax)
6.8
Cloudera Data Platform (CDP) on premises (private cloud)
7.1.9
Google BigQuery (Simba)
1.6.2
1.2.25
MongoDB
7.0
Vertica*
24
* Deprecated beginning in version 11.0.
Big Data Sources: Details
This table shows the Big Data sources that are compatible with specific Pentaho tools.
Amazon EMR
7.7.0¹ (Certified)
No
No
No
No
Yes
Yes
No
No
Cassandra (Datastax)
6.8 (Certified)
No
No
No
No
Yes
No
No
No
Cloudera Data Platform (CDP) Private Cloud
7.1.9 (for job execution)
No
No
No
No
Yes
Yes
No
Yes
Cloudera Data Platform (CDP) Private Cloud
via Hive3² (as data source)
No
Yes
Yes
Yes
Yes
Yes
No
Yes
MongoDB
7
No
No
Yes
No
Yes
Yes
No
No
Vertica⁴
11
Yes
Yes
Yes
Yes
Yes
Yes
Yes
Yes
1 EMR clusters (version 7.x and later) built with JDK 17 exclude the commons-lang-2.6.jar library from their standard Hadoop library directories ($HADOOP_HOME/lib). To use the EMR driver for EMR 7.x, obtain the commons-lang-2.6.jar file from a trusted source, such as the official Maven repository (Maven Repository: commons-lang » commons-lang » 2.6). Then manually copy the downloaded JAR file to the $HADOOP_HOME/lib or $HADOOP_MAPRED_HOME/lib directory on each node within the EMR cluster to ensure that all worker nodes have access to the library.
2 Hive3 as a data source for CDP also supports Hive LLAP, and Hive3 on Tez.
3 The Simba driver required for Google BigQuery is the JDBC 4.2-compatible version, which you can download from https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.2.2.1004.zip.
4 Deprecated beginning in version 11.0.
Note: A generic Apache Hadoop driver is included in the Pentaho distribution for version 10.2. Other supported drivers can be downloaded from the Support Portal.
SQL Dialect-Specific
Pentaho software generates dialect-specific SQL when communicating with these data sources. Certified indicates the SQL dialect has been tested for compatibility with Pentaho.
Pentaho Analyzer
Certified
Amazon Redshift
Azure SQL
Impala
MySQL
Microsoft SQL Server
Oracle
PostgreSQL
Snowflake
Supported
Access
Firebird
Hsqldb
IBM DB2
IBM MQ 9.2
Informix
Ingres¹
Interbase¹
Neoview¹
SqlStream
Sybase¹
Vectorwise¹
Vertica¹
Other SQL-89 compliant²
Pentaho Metadata
Certified
Azure SQL
Hive 2
Impala
MySQL
PostgreSQL
Supported
Amazon Redshift
ASSQL
Firebird
H2
Hypersonic
IBM DB2
IBM MQ 9.2
Ingres¹
Interbase¹
MS Access
MS SQL Server (JTDS Driver)
MS SQL Server (Microsoft Driver)
Snowflake
Sybase¹
Vertica¹
Other SQL-92 compliant²
Pentaho Data Integration
Certified
Amazon Redshift
Azure SQL
Hive¹
Hive 2
Impala
MS SQL Server (JTDS Driver)
MS SQL Server (Microsoft Driver)
MySQL
Oracle
PostgreSQL
Snowflake
Vertica¹
Supported
AS/400
InfiniDB¹
Exasol 4
Firebird SQL
H2
Hypersonic
IBM DB2
IBM MQ 9.2
Informix
Ingres¹
Ingres VectorWise¹
MaxDB (SAP DB)
Neoview¹
Oracle RDB
SQLite
UniVerse database
Other SQL-92 compliant²
1 Deprecated beginning in version 11.0.
2 If your data source is not in this list and is compatible with SQL-92, Pentaho software uses a generic SQL dialect.
Security
Pentaho software integrates with these third-party security authentication systems:
CAS 7 (Certified)
Integrated Windows Authentication with Internet Information Services 10 (Certified)
Spring 6.2.12 (Certified)
Java virtual machine
Pentaho software requirements for Java Runtime Environment (JRE).
All Pentaho software
Oracle Java 21
Oracle OpenJDK 21
Oracle Java 17
Oracle OpenJDK 17
Note: The PDI client requires at least Java 11.x to run on Windows 11.
Web browsers
Pentaho supports major versions of web browsers that are publicly available six weeks before the finalization of a Pentaho release.
The following browsers are certified:
Apple Safari 26.1 (On macOS only)
Google Chrome 142
Microsoft Edge 142
Mozilla Firefox 144
Support Statement for Analyzer on Impala
These are the minimum requirements for Analyzer to work with Impala:
Pentaho 7.1 or later
Impala 1.3.x or later
Use the Parquet compressed file format for tables in Impala (recommended)
Make sure that the JDBC driver is dropped into the Pentaho Server and Schema Workbench directories. See the Install Pentaho Data Integration and Analytics document for details.
Turn off connection pooling in Pentaho Server.
In Mondrian schemas, divide dimension tables with high cardinality into several levels.
Note: As with any data source, the performance of Pentaho Analyzer on Impala will be dependent upon the data shape, Impala’s configuration, and the types of queries. See the best practice, "Pentaho Analyzer with Impala as a Data Source" located at: https://support.pentaho.com/hc/en-us/articles/208652846 or download the PDF.
Compiled Mondrian automated test suite results are available for Analyzer on Impala with the OEM Simba driver, as well as with the community Apache Hive driver.
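For reference, once the JDBC driver is in place, the connection to Impala reduces to a JDBC URL. The sketch below shows one plausible form of that URL; the host, schema, default port 21050, and the AuthMech property are illustrative assumptions based on the Cloudera (Simba) Impala driver's URL format, not values from this document.

```java
// Sketch of a JDBC URL for an Impala data source.
// Host, port, and schema are placeholders; the URL shape assumes the
// Cloudera (Simba) Impala JDBC driver, whose daemon listens on
// port 21050 by default.
public class ImpalaUrlSketch {

    static String buildUrl(String host, int port, String schema) {
        // AuthMech=0 means "no authentication"; adjust for Kerberos, etc.
        return "jdbc:impala://" + host + ":" + port + "/" + schema + ";AuthMech=0";
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("impala-host.example.com", 21050, "default"));
    }
}
```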
Google BigQuery
You can use Google BigQuery as a data source with the Pentaho User Console or with the PDI client.
Before you begin, you must have a Google account and create service account credentials, in the form of a JSON key file, to connect to Google BigQuery. To create service account credentials, see the Google Cloud Storage Authentication documentation.
Additionally, you must set permissions for your BigQuery and Google Cloud accounts. To configure your service account authentication, see the Google Service Account documentation.
Perform the following steps to create a JDBC connection to a Google BigQuery data source from the User Console or PDI client.
Stop the Pentaho Server.
Download the ZIP file containing the Simba version 1.5.4.1008 JDBC 4.2 driver for Google BigQuery from https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.2.2.1004.zip.
Navigate to the server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/lib directory for the User Console or the design-tools/data-integration/lib directory for the PDI client and delete any files associated with previous versions of Google BigQuery. Visually verify each file to ensure the older version is deleted.
Extract the following files to the server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/lib folder for the User Console or the design-tools/data-integration/lib directory for the PDI client:
animal-sniffer-annotations-1.14.jar
api-common-1.7.0.jar
avro-1.9.0.jar
checker-compat-qual-2.5.2.jar
error_prone_annotations-2.1.3.jar
gax-1.42.0.jar
gax-grpc-1.42.0.jar
google-api-client-1.28.0.jar
google-api-services-bigquery-v2-rev426-1.25.0.jar
google-auth-library-credentials-0.15.0.jar
google-auth-library-oauth2-http-0.13.0.jar
GoogleBigQueryJDBC42.jar
google-cloud-bigquerystorage-0.85.0-alpha.jar
google-cloud-core-1.67.0.jar
google-cloud-core-grpc-1.67.0.jar
google-http-client-1.29.0.jar
google-http-client-apache-2.0.0.jar
google-http-client-jackson2-1.28.0.jar
google-oauth-client-1.28.0.jar
grpc-alts-1.18.0.jar
grpc-auth-1.18.0.jar
grpc-context-1.18.0.jar
grpc-core-1.18.0.jar
grpc-google-cloud-bigquerystorage-v1beta1-0.50.0.jar
grpc-grpclb-1.18.0.jar
grpc-netty-shaded-1.18.0.jar
grpc-protobuf-1.18.0.jar
grpc-protobuf-lite-1.18.0.jar
grpc-stub-1.18.0.jar
gson-2.7.jar
j2objc-annotations-1.1.jar
javax.annotation-api-1.3.2.jar
jsr305-3.0.2.jar
opencensus-api-0.18.0.jar
opencensus-contrib-grpc-metrics-0.18.0.jar
opencensus-contrib-http-util-0.18.0.jar
protobuf-java-3.7.0.jar
protobuf-java-util-3.7.0.jar
proto-google-cloud-bigquerystorage-v1beta1-0.50.0.jar
proto-google-common-protos-1.15.0.jar
proto-google-iam-v1-0.12.0.jar
threetenbp-1.3.3.jar
Note: The Google BigQuery connection name does not display in the User Console Database Connection dialog box until you copy these files.
Restart the Pentaho Server.
Log on to the User Console or the PDI client, then open the Database Connection dialog box.
See the Install Pentaho Data Integration and Analytics document for more information on the Database Connection dialog box.
In the Database Connection dialog box, select General, then select Google BigQuery as the Database Type.
In the Settings area, enter the information for your Google BigQuery account.
The Host Name is the URL to Google's BigQuery web services API. For example, https://www.googleapis.com/bigquery/v2
The Project ID in the PDI client and the Database name in the User Console are identical.
The Port Number is 443.
Click Options, then add the following parameters and values.
OAuthType
0 (zero)
OAuthServiceAcctEmail
Specify your service account email address.
OAuthPvtKeyPath
Specify the path to your private key credential file.
Timeout
Specify the amount of time, in seconds, before the server closes the connection. The recommended value is 120 seconds.
Click Test to verify that you can connect to your data.
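The dialog settings and option parameters above map onto a single Simba JDBC URL behind the scenes. The sketch below shows how those values combine; the project ID, service account email, and key path are placeholders, and the URL shape is an assumption based on the Simba BigQuery driver's documented format.

```java
// Sketch that assembles the Simba BigQuery JDBC URL from the values
// entered in the Database Connection dialog. The project ID, service
// account email, and key path are placeholders.
public class BigQueryUrlSketch {

    static String buildUrl(String project, String email, String keyPath) {
        return "jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;"
                + "ProjectId=" + project + ";"
                + "OAuthType=0;"                        // service-account auth
                + "OAuthServiceAcctEmail=" + email + ";"
                + "OAuthPvtKeyPath=" + keyPath + ";"
                + "Timeout=120";                        // recommended timeout, seconds
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("my-project",
                "svc@my-project.iam.gserviceaccount.com",
                "/etc/pentaho/bq-key.json"));
    }
}
```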