Big Data Sources: Details
This table shows the Big Data sources that are compatible with specific Pentaho tools.
| Data Source | Versions | Analyzer | PIR/PDD | Pentaho Reporting | DSW | PDI Server/Client | PRD | PSW | PME |
|---|---|---|---|---|---|---|---|---|---|
| Amazon EMR | 7.0.0 [e] (Certified) | No | No | No | No | Yes | Yes | No | No |
| Apache Vanilla Hadoop | 3.3.0 (Certified) | No | No | No | Yes | Yes | No | No | No |
| Cassandra (Datastax) | 6.8 (Certified) | No | No | No | No | Yes | No | No | No |
| Cloudera Data Platform (CDP) Private Cloud | 7.1.9 (for job execution) | No | No | No | No | Yes | Yes | No | Yes |
| | via Hive3 [a] (as data source) | No | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Google Dataproc [c] (for job execution) | 2.1 [d] | No | No | No | No | Yes | Yes | No | No |
| via Hive2 and Google BigQuery (as data source) | | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Greenplum | 4.3 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Microsoft Azure HDInsight | 4.0 | Yes | Yes | No | No | Yes | No | No | Yes |
| MongoDB | 7 | No | No | Yes | No | Yes | Yes | No | No |
| Vertica | 11 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Notes:
A generic Apache Hadoop driver is included in the Pentaho distribution for version 10.2. Other supported drivers can be downloaded from the Support Portal.
[a] Hive3 as a data source for CDP also supports Hive LLAP and Hive3 on Tez.
[b] The Simba driver required for Google BigQuery is the JDBC 4.2-compatible version, which you can download from https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.2.2.1004.zip.
[c] HBase is not supported with Google Dataproc.
[d] Use the Google Dataproc 2.1 driver for your Google Dataproc 2.2 cluster. The Google Dataproc 2.1 driver is certified to work with Google Dataproc 2.2.
[e] EMR clusters (version 7.x and later) built with JDK 17 exclude the commons-lang-2.6.jar library from their standard Hadoop library directories ($HADOOP_HOME/lib). To use the EMR driver for EMR 7.x, obtain the commons-lang-2.6.jar file from a trusted source, such as the official Maven repository (commons-lang » commons-lang » 2.6). Then manually copy the downloaded JAR file to the $HADOOP_HOME/lib or $HADOOP_MAPRED_HOME/lib directory on each node in the EMR cluster so that all worker nodes have access to the library.
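The manual copy step described in the EMR 7.x note can be planned with a short script. The following is a minimal sketch, not a supported Pentaho procedure: it only builds the commands and executes nothing. The node host names, the `hadoop` SSH user, and the `/usr/lib/hadoop/lib` path (standing in for `$HADOOP_HOME/lib`) are assumptions for illustration, and `plan_jar_distribution` is a hypothetical helper name. The download URL follows the standard Maven Central layout for commons-lang 2.6.

```python
import shlex

# Maven Central location of commons-lang 2.6 (standard repository layout).
MAVEN_BASE = "https://repo1.maven.org/maven2"
JAR_NAME = "commons-lang-2.6.jar"
JAR_URL = f"{MAVEN_BASE}/commons-lang/commons-lang/2.6/{JAR_NAME}"


def plan_jar_distribution(nodes, hadoop_lib_dir, user="hadoop"):
    """Build the commands that download the JAR once and copy it to every node.

    Nothing is executed here; the caller reviews the plan and runs it manually.
    The SSH user is an assumption and may differ on your cluster.
    """
    # Step 1: download the JAR to the current directory.
    commands = [["curl", "-fSL", "-o", JAR_NAME, JAR_URL]]
    # Step 2: one scp per node, so every node ends up with the library.
    for node in nodes:
        commands.append(["scp", JAR_NAME, f"{user}@{node}:{hadoop_lib_dir}/"])
    return commands


if __name__ == "__main__":
    # Hypothetical host names and library path; substitute your cluster's values.
    plan = plan_jar_distribution(
        nodes=["emr-master.example.com", "emr-core-1.example.com"],
        hadoop_lib_dir="/usr/lib/hadoop/lib",
    )
    for argv in plan:
        print(shlex.join(argv))
```

Printing the plan instead of running it lets you confirm the target directory and node list first. Writing to the Hadoop library directory may require elevated permissions on the nodes, which this sketch does not handle.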