Setting up the DI Operations Mart with a manual installation

Follow these instructions for setting up the Data Integration Operations Mart if you installed Pentaho with the manual installation method. See the Install Pentaho Data Integration and Analytics document for details on the manual installation method.

Before you begin

  • postgresql

  • mysql15

  • oracle10g or oracle12c

  • sqlserver

Note: In the file name, databasename is the name of your Pentaho Server Repository database type:

  • pentaho_logging_*databasename*.sql

  • pentaho_mart_*databasename*.sql

To install the Data Integration Operations Mart, you must have the following two scripts:

Data Integration Operations Mart scripts

  • PostgreSQL

  • MySQL

  • Oracle

  • MS SQL Server

Before proceeding with the Data Integration Operations Mart installation steps, ensure that your Pentaho Server and Pentaho Repository are configured with one of the following database types:

Required database

These Data Integration Operations Mart installation instructions assume that you have installed the Pentaho Server with the manual installation method. If you need to review the installation instructions, see the Install Pentaho Data Integration and Analytics document.

Manual installation of the Pentaho Server

Installation of the Data Integration Operations Mart depends on several conditions and prerequisites.

Process overview

To install the Data Integration Operations Mart, perform the following steps:

Step 1: Get the Data Integration Operations Mart files

Download Data Integration Operation Mart files from the Support Portal to obtain the scripts for creating tables that log activity for transformations and jobs.

  1. Download the pentaho-operations-mart-9.3.0.0-<build number>.zip file from the Support Portal.

    1. On the Support Portal home page, sign in using the Pentaho support username and password provided in your Pentaho Welcome Packet.

    2. In the Pentaho card, click Download.

      The Downloads page opens.

    3. In the 9.x list, click See all <number> articles to see the full list of 9.x downloads.

    4. On the 9.x page, click Pentaho 9.3 GA Release.

    5. Scroll to the bottom of the Pentaho 9.3 GA Release page.

    6. In the file component section, navigate to the Operations Mart folder.

    7. Download the pentaho-operations-mart-9.3.0.0-<build number>-dist.zip file.

  2. Unpack the pentaho-operations-mart-9.3.0.0-<build number>-dist.zip file to a temporary directory.

    The pentaho-operations-mart-9.3.0.0-<build number> directory is created in the temporary directory.

  3. In the pentaho-operations-mart-9.3.0.0-<build number> directory, run one of the following scripts that is appropriate for the OS:

    • install.bat

    • install.sh

    • installer.jar

  4. In the IZPack window, read the license agreement, select I accept the terms of this license agreement, and then click Next.

  5. In the Select the installation path text box, browse to or enter the directory location where you want to unpack the files, then click Next.

  6. If you chose an existing directory, a warning message that the directory already exists appears. Click Yes to continue.

    Any existing files in the directory are retained.

  7. When the installation progress is complete, click Quit.

  8. Navigate to the directory where you unpacked the pentaho-operations-mart-9.3.0.0-dist.zip file.

  9. Locate the pentaho-operations-mart-ddl-9.3.0.0-<build number>-dist.zip file.

  10. Move the pentaho-operations-mart-ddl-9.3.0.0-<build number>-dist.zip file to the following directory and unpack the file.

    <install directory>/pentaho-server/data/

    Unpacking the file writes two required scripts into the repository database directory.

    The following table lists the directories for each type of repository database and the two installation scripts that are unpacked into the directory.

    Note: The directory also contains update scripts that you do not need for the installation process.

Directory
Installation scripts

/postgresql

pentaho_mart_postgresql.sql``pentaho_logging_postgresql.sql

/mysql5

pentaho_mart_mysql.sql``pentaho_logging_mysql.sql

/oracle (10g or 12c)

pentaho_mart_oracle.sql``pentaho_logging_oracle.sql

/sqlserver

pentaho_mart_sqlserver.sql``pentaho_logging_sqlserver.sql

The following files, which contain Operations Mart sample content, are included in the respective directories. Operations Mart sample content is needed for [Step 6: Add the Data Integration Operations Mart ETL solutions to the Pentaho Repository default content folder](Step%206%20Add%20the%20ETL%20solution%20and%20sample%20reports%20(DI%20Ops%20Mart%20-%20Manual).md).

-   `pentaho-operations-mart-operations-di-9.3.0.0.zip`
-   `pentaho-operations-mart-operations-bi-9.3.0.0.zip`
The files that are required for each type of repository database are also included in the directory. The following table lists the files that are required for each type of repository database.
Database Type
Required files

Oracle

pentaho-operations-mart-etl-oracle10g-<version>.zip``pentaho-operations-mart-clean-oracle10g-<version>.zip

MySQL5

pentaho-operations-mart-etl-mysql5-<version>.zip``pentaho-operations-mart-clean-mysql5-<version>.zip

MS SQL Server

pentaho-operations-mart-etl-mssql-<version>.zip``pentaho-operations-mart-clean-mssql-<version>.zip

Step 2: Run the setup scripts

Depending on your database repository type, run each of the scripts listed below in the order shown. These scripts create the tables that log the activity for transformations and jobs.

  1. pentaho_logging_*databasename*.sql

  2. pentaho_mart_*databasename*.sql

Note: In the file name, databasename is the name of your Pentaho Server Repository database type:

  • postgresql

  • mysql15

  • oracle10g or oracle12c

  • sqlserver

Step 3: Set the global Kettle logging variables

Perform this step on the computer where you have installed your Pentaho Data Integration (PDI) client and the Pentaho Server.

When you run PDI for the first time, the kettle.properties file is created and stored in the $USER_HOME/.kettle.properties directory.

  1. In the PDI client, select Edit > Edit the kettle.properties file.

  2. Add or edit the values for each of the logging variables shown in the following log tables:

    Note: If you customized the values for these logging variables in the following scripts, add the customized values for your site rather than the default values shown in the table.

    • pentaho_logging_*databasename*.sql

    • pentaho_mart_*databasename*.sql where databasename is your database type.

    Note: For Oracle and Microsoft SQL Server, leave Value blank with Variables that contain 'SCHEMA' in the name.

    Variable
    Value

    KETTLE_CHANNEL_LOG_DB

    live_logging_info

    KETTLE_CHANNEL_LOG_TABLE

    channel_logs

    KETTLE_CHANNEL_LOG_SCHEMA

    pentaho_dilogs

    Variable
    Value

    KETTLE_JOBENTRY_LOG_DB

    live_logging_info

    KETTLE_JOBENTRY_LOG_TABLE

    jobentry_logs

    KETTLE_JOBENTRY_LOG_SCHEMA

    pentaho_dilogs

    Variable
    Value

    KETTLE_JOB_LOG_DB

    live_logging_info

    KETTLE_JOB_LOG_TABLE

    job_logs

    KETTLE_JOB_LOG_SCHEMA

    pentaho_dilogs

    Variable
    Value

    KETTLE_METRICS_LOG_DB

    live_logging_info

    KETTLE_METRICS_LOG_TABLE

    metrics_logs

    KETTLE_METRICS_LOG_SCHEMA

    pentaho_dilogs

Step 4: Add the JNDI connections for logging

This section explains how to add the logging (live_logging_info) and Operations Mart (PDI_Operations_Mart) connections for a PDI client.

  1. Navigate to the PDI client <install directory>/data-integration/simple-jndi directory.

  2. Open the jdbc.properties file with a text editor.

  3. Depending on your repository database type, update the values accordingly (URL, users, password) as shown in the following samples.

    PostgreSQL:

     PDI_Operations_Mart/type=javax.sql.DataSource
    						PDI_Operations_Mart/driver=org.postgresql.Driver
    						PDI_Operations_Mart/url=jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_operations_mart
    						PDI_Operations_Mart/user=hibuser
    						PDI_Operations_Mart/password=password
    						live_logging_info/type=javax.sql.DataSource
    						live_logging_info/driver=org.postgresql.Driver
    						live_logging_info/url=jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_dilogs
    						live_logging_info/user=hibuser
    						live_logging_info/password=password

    MySQL:

    PDI_Operations_Mart/type=javax.sql.DataSource
    						PDI_Operations_Mart/driver=com.mysql.jdbc.Driver
    						PDI_Operations_Mart/url=jdbc:mysql://localhost:3306/pentaho_operations_mart
    						PDI_Operations_Mart/user=hibuser
    						PDI_Operations_Mart/password=password
    						live_logging_info/type=javax.sql.DataSource
    						live_logging_info/driver=com.mysql.jdbc.Driver
    						live_logging_info/url=jdbc:mysql://localhost:3306/pentaho_dilogs
    						live_logging_info/user=hibuser
    						live_logging_info/password=password

    Oracle:

    PDI_Operations_Mart/type=javax.sql.DataSource
    						PDI_Operations_Mart/driver=oracle.jdbc.OracleDriver
    						PDI_Operations_Mart/url=jdbc:oracle:thin:@localhost:1521/XE
    						PDI_Operations_Mart/user=pentaho_operations_mart
    						PDI_Operations_Mart/password=password
    						live_logging_info/type=javax.sql.DataSource
    						live_logging_info/driver=oracle.jdbc.OracleDriver
    						live_logging_info/url=jdbc:oracle:thin:@localhost:1521/XE
    						live_logging_info/user=pentaho_dilogs
    						live_logging_info/password=password

    Microsoft SQL Server:

    PDI_Operations_Mart/type=javax.sql.DataSource
    						PDI_Operations_Mart/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
    						PDI_Operations_Mart/url=jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart
    						PDI_Operations_Mart/user=pentaho_operations_mart
    						PDI_Operations_Mart/password=password
    						live_logging_info/type=javax.sql.DataSource
    						live_logging_info/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
    						live_logging_info/url=jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_dilogs
    						live_logging_info/user=dilogs_user						live_logging_info/password=password

Step 5: Add a JDBC connection for the Pentaho Server

Add a JDBC connection for the Pentaho Server on the machine where you have installed the Pentaho Server.

Complete the following steps to add a JDBC connection:

  1. Navigate to the <install directory>/server/pentaho-server/tomcat/webapps/Pentaho/META-INF/ folder.

  2. Open the context.xml file with a text editor.

    Note: If the context.xml file is missing, you must locate and unpack the pentaho.war file that contains the context.xml file, and then move the context.xml file into the META-INF folder. The pentaho.war file is in one of the following locations:

    1. If the Pentaho Server is deployed on Linux, the pentaho.war file is in one of the following directories:

      • Tomcat: pentaho/server/pentaho-server/<your tomcat installation directory>/webapps

      • JBoss: pentaho/server/pentaho-server/<your jboss installation directory>/standalone/deployments

    2. If the Pentaho Server is deployed on Windows, the pentaho.war file is in one of the following folders:

      • Tomcat: pentaho\server\pentaho-server\<your tomcat installation directory>/webapps

      • JBoss: pentaho\server\pentaho-server\<your jboss installation directory>\standalone\deployments

  3. Depending on your database type, edit the file to reflect the values, as shown in the following examples.

  4. (Optional) If you plan to use encrypted passwords, locate all occurrences of the factory setting and replace them with the following value:

    factory="org.pentaho.di.core.database.util.DecryptingDataSourceFactory

    PostgreSQL:

     <Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0"
    						maxWait="10000" username="hibuser" password="password"
    						driverClassName="org.postgresql.Driver" 
    						url="jdbc:postgresql://localhost:5432/hibernate"
    						validationQuery="select 1"/>
    						
    						<Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0"
    						maxWait="10000" username="hibuser" password="password"
    						driverClassName="org.postgresql.Driver" 
    						url="jdbc:postgresql://localhost:5432/hibernate"
    						validationQuery="select 1"/>
    						
    						<Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0"
    						maxWait="10000" username="hibuser" password="password"
    						driverClassName="org.postgresql.Driver" 
    						url="jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_dilogs"            
    						validationQuery="select 1"/>

    MySQL:

      <Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5"
    						maxWait="10000" username="hibuser" password="password"
    						driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_operations_mart"
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>
    						
    						<Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5"
    						maxWait="10000" username="hibuser" password="password"
    						driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_operations_mart"
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>
    						
    						<Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5"
    						maxWait="10000" username="hibuser" password="password"
    						driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_dilogs"            
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>

    Oracle:

    <Resource 
    						validationQuery="select 1 from dual"
    						url="jdbc:oracle:thin:@localhost:1521/orcl"
    						driverClassName="oracle.jdbc.OracleDriver"
    						password="password"
    						username="pentaho_operations_mart"
    						initialSize="0"
    						maxActive="20"
    						maxIdle="10"
    						maxWait="10000"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
    						type="javax.sql.DataSource"
    						auth="Container"
    						connectionProperties="oracle.jdbc.J2EE13Compliant=true"
    						name="jdbc/pentaho_operations_mart"/>
    						
    						<Resource 
    						validationQuery="select 1 from dual"
    						url="jdbc:oracle:thin:@localhost:1521/orcl"
    						driverClassName="oracle.jdbc.OracleDriver"
    						password="password"
    						username="pentaho_operations_mart"
    						initialSize="0"
    						maxActive="20"
    						maxIdle="10"
    						maxWait="10000"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
    						type="javax.sql.DataSource"
    						auth="Container"
    						connectionProperties="oracle.jdbc.J2EE13Compliant=true"
    						name="jdbc/PDI_Operations_Mart"/>
    						
    						<Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" 
    						driverClassName="oracle.jdbc.OracleDriver" password="password" 
    						username="pentaho_dilogs" maxWaitMillis="10000" maxIdle="5" maxTotal="20" 
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true" 
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" type="javax.sql.DataSource" 
    						auth="Container" name="jdbc/live_logging_info"/>

    Microsoft SQL Server:

    <Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5"
    						maxWaitMillis="10000" username="pentaho_operations_mart" password="password" 
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true"
    						driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
    						url="jdbc:sqlserver://localhost:1433;DatabaseName=pentaho_operations_mart"
    						validationQuery="select 1"/>
    						
    						<Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5"
    						maxWaitMillis="10000" username="pentaho_operations_mart" password="password" 
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true"
    						driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
    						url="jdbc:sqlserver://localhost:1433;DatabaseName=pentaho_operations_mart"
    						validationQuery="select 1"/>
    						
    						<Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource"
    						factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5"
    						maxWaitMillis="10000" username="dilogs_user" password="password" 
    						jdbcInterceptors="ConnectionState" defaultAutoCommit="true"
    						driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
    						url="jdbc:sqlserver://localhost:1433;DatabaseName=pentaho_dilogs"            
    						validationQuery="select 1"/>

Step 6: Add the Data Integration Operations Mart ETL solutions to the Pentaho Repository default content folder

If you are using PostgreSQL as your repository database, you can skip to Step 7: Initialize the Data Integration Operations Mart.

  1. Stop the Pentaho Server.

  2. Depending on your repository database type, locate the ZIP files containing the ETL solution and sample reports.

    You downloaded and unpacked these files in Step 1: Get the Data Integration Operations Mart files.

    • pentaho-operations-mart-operations-di-9.3.0.zip

    • pentaho-operations-mart-operations-bi-9.3.0.zip Additionally, locate the two ZIP files that are specific to your repository type:

Directory

File names

/oracle (10g or 12c)

  • pentaho-operations-mart-etl-oracle10g-9.3.0.zip

  • pentaho-operations-mart-clean-oracle10g-9.3.0.zip

/mysql5

  • pentaho-operations-mart-etl-mysql5-9.3.0.zip

  • pentaho-operations-mart-clean-mysql5-9.3.0.zip

/sqlserver

  • pentaho-operations-mart-etl-mssql-9.3.0.zip

  • pentaho-operations-mart-clean-mssql-9.3.0.zip

3. Copy all four ZIP files \(di, bi, mart-etl, mart-clean\) for your database to this directory:

`$PENTAHO_HOME/pentaho-server/pentaho-solution/system/default-content`

-   DI Operations Mart sample reports: pentaho-operations-mart-operations-di-9.3.0-dist.zip
-   BA Operations Mart sample reports: pentaho-operations-mart-operations-bi-9.3.0-dist.zip

4. Start the Pentaho Server.

When you restart the Pentaho Server, the startup process unpacks the content in the ZIP files to generate the Pentaho User Console \(PUC\) reports, sample transformations, and sample jobs needed to use the Data Integration Operations Mart.

**Note:** After these files are processed by the Pentaho Server, they are renamed with a timestamp so that each subsequent time you start the Pentaho Server, it does not unzip them again. You must keep these files in this directory, even though the date/timestamp is the installation date.

Step 7: Initialize the Data Integration Operations Mart

Perform these steps for the Operations Mart to start creating and collecting log file content.

  1. Launch the PDI client (Spoon).

  2. Connect to the Pentaho Repository through the Pentaho Server.

  3. At the main menu, select File > Open.

  4. Select Browse Files > Public > Pentaho Operations Mart > DI Ops Mart ETL.

    List of DI Ops Mart files in PDI Client folder
  5. To initiate the transformation and job logging processes, open each transformation and job.

  6. In each transformation or job, open the associated Job Properties or Transformation Properties window and click the Logging tab.

    • For logging to occur, you must at a minimum add a value to each individual Log Connection field shown in the tables below for jobs and transformations. A best practice tip for these fields is to use the global variables, as shown in the tables. You can also use the values you customized for your site and defined in the kettle.properties file during Step 3: Set the global Kettle logging variables.

    • If you leave all three fields shown in each table as empty values, then no logging occurs.

    • Logging also occurs if you add a value to all three fields, as shown in each table.

    • For job logging, add values to the Log connection, Log table schema, and Log table name fields as shown in the tables below for the Job log table, Job entry log table, and Logging channel log table in the Log tab. You can also use any values you have customized for your site.

      Job properties window, Logging tab
      Field
      Value

      Log connection

      ${KETTLE_JOB_LOG_DB}

      Log table schema

      ${KETTLE_JOB_LOG_SCHEMA}

      Log table name

      ${KETTLE_JOB_LOG_TABLE}

      Field
      Value

      Log connection

      ${KETTLE_JOBENTRY_LOG_DB}

      Log table schema

      ${KETTLE_JOBENTRY_LOG_SCHEMA}

      Log table name

      ${KETTLE_JOBENTRY_LOG_TABLE}

      Field
      Value

      Log connection

      ${KETTLE_CHANNEL_LOG_DB}

      Log table schema

      ${KETTLE_CHANNEL_LOG_SCHEMA}

      Log table name

      ${KETTLE_CHANNEL_LOG_TABLE}

    • For transformation logging, add values to the Log connection, Log table schema, and Log table name fields as shown in the following tables for the Transformation, Step, Performance, Logging channels, and Metrics in the Logging tab. You can also use a value you have customized for your site.

      Transformation properties window, Logging tab
      Field
      Value

      Log connection

      ${KETTLE_TRANS_LOG_DB}

      Log table schema

      ${KETTLE_TRANS_LOG_SCHEMA}

      Log table name

      ${KETTLE_TRANS_LOG_TABLE}

      Field
      Value

      Log connection

      ${KETTLE_STEP_LOG_DB}

      Log table schema

      ${KETTLE_STEP_LOG_SCHEMA}

      Log table name

      ${KETTLE_STEP_LOG_TABLE}

      Field
      Value

      Log connection

      ${KETTLE_TRANS_PERFORMANCE_LOG_DB}

      Log table schema

      ${KETTLE_TRANS_PERFORMANCE_LOG_SCHEMA}

      Log table name

      ${KETTLE_TRANS_PERFORMANCE_LOG_TABLE}

      Field
      Value

      Log connection

      ${KETTLE_CHANNEL _LOG_DB}

      Log table schema

      ${KETTLE_CHANNEL_LOG_SCHEMA}

      Log table name

      ${KETTLE_CHANNEL_LOG_TABLE}

      Field
      Value

      Log connection

      ${KETTLE_METRICS_LOG_DB}

      Log table schema

      ${KETTLE_METRICS_LOG_SCHEMA}

      Log table name

      ${KETTLE_METRICS_LOG_TABLE}

  7. In the main menu, select File > Open.

  8. Select Browse Files > Public > Pentaho Operations Mart > DI Ops Mart ETL > Fill_in_DIM_DATE_and_DIM_TIME job file and run it.

  9. Run a few of the sample KTRs, to generate logging activities for Step 8: Verify the Data Integration Operations Mart is working. You can also use or create your own sample KTRs.

  10. At the main menu, select File > Open.

  11. Select Public Pentaho Operations Mart DI Ops Mart ETL > Update_Dimensions_then_Logging_Data job file and run it.

    All the transformations and jobs are placed in the Pentaho Repository, then the data mart is populated. You can then set your transformations and jobs to run on a schedule, based on how often you want this data refreshed.

Step 8: Verify the Data Integration Operations Mart is working

  1. In the PDI client, select Browse Files > Public > Pentaho Operations Mart > DI Audit Reports > Last_Runand open it.

  2. Check that you see the jobs and transformations that were run in Step 7: Initialize the Data Integration Operations Mart.

Last updated

Was this helpful?