Google BigQuery
You can use Google BigQuery as a data source with the Pentaho User Console or with the PDI client.
Before you begin, you must have a Google account and must create service account credentials in the form of a key file in JSON format to connect to Google BigQuery. To create service account credentials, see the Google Cloud Storage Authentication documentation.
Additionally, you must set permissions for your BigQuery and Google Cloud accounts. To configure your service account authentication, see the Google Service Account documentation.
Perform the following steps to create a JDBC connection to a Google BigQuery data source from the User Console or PDI client.
Stop the Pentaho Server.
Download the ZIP file containing the Simba version 1.5.4.1008 JDBC 4.2 driver for Google BigQuery from https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.2.2.1004.zip.
Navigate to the
server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/lib
directory for the User Console or thedesign-tools/data-integration/lib
directory for the PDI client and delete any files associated with previous versions of Google BigQuery.Visually verify each file to ensure the older version is deleted.
Extract the following files to the
server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/lib
folder for the User Console or thedesign-tools/data-integration/lib
directory for the PDI client.animal-sniffer-annotations-1.14.jar
api-common-1.7.0.jar
avro-1.9.0.jar
checker-compat-qual-2.5.2.jar
error_prone_annotations-2.1.3.jar
gax-1.42.0.jar
gax-grpc-1.42.0.jar
google-api-client-1.28.0.jar
google-api-services-bigquery-v2-rev426-1.25.0.jar
google-auth-library-credentials-0.15.0.jar
google-auth-library-oauth2-http-0.13.0.jar
GoogleBigQueryJDBC42.jar
google-cloud-bigquerystorage-0.85.0-alpha.jar
google-cloud-core-1.67.0.jar
google-cloud-core-grpc-1.67.0.jar
google-http-client-1.29.0.jar
google-http-client-apache-2.0.0.jar
google-http-client-jackson2-1.28.0.jar
google-oauth-client-1.28.0.jar
grpc-alts-1.18.0.jar
grpc-auth-1.18.0.jar
grpc-context-1.18.0.jar
grpc-core-1.18.0.jar
grpc-google-cloud-bigquerystorage-v1beta1-0.50.0.jar
grpc-grpclb-1.18.0.jar
grpc-netty-shaded-1.18.0.jar
grpc-protobuf-1.18.0.jar
grpc-protobuf-lite-1.18.0.jar
grpc-stub-1.18.0.jar
gson-2.7.jar
j2objc-annotations-1.1.jar
javax.annotation-api-1.3.2.jar
jsr305-3.0.2.jar
opencensus-api-0.18.0.jar
opencensus-contrib-grpc-metrics-0.18.0.jar
opencensus-contrib-http-util-0.18.0.jar
protobuf-java-3.7.0.jar
protobuf-java-util-3.7.0.jar
proto-google-cloud-bigquerystorage-v1beta1-0.50.0.jar
proto-google-common-protos-1.15.0.jar
proto-google-iam-v1-0.12.0.jar
threetenbp-1.3.3.jar
Note: The Google BigQuery connection name does not display in the User Console Database Connection dialog box until you copy these files.
Restart the Pentaho Server.
Log on to the User Console or the PDI client, then open the Database Connection dialog box.
See the Install Pentaho Data Integration and Analytics document for more information on the Database Connection dialog box.
In the Database Connection dialog box, select General, then select Google BigQuery as the Database Type.
In the Settings area, enter the information for your Google BigQuery account.
The Host Name is the URL to Google's BigQuery web services API. For example, https://www.googleapis.com/bigquery/v2
The Project ID in the PDI client and the Database name in the User Console are identical.
The Port Number is
443
.
Click Options, then add the following parameters and values.
ParameterValueOAuthType
0
(Zero)OAuthServiceAcctEmail
Specify your service account email address.
OAuthPvtKeyPath
Specify the path to your private key credential file.
Timeout
Specify the amount of time, in seconds, before the server closes the connection. The recommended value is 120 seconds.
Click Test to verify that you can connect to your data.
Last updated
Was this helpful?