Pentaho Operations Mart
This page is archived. See Performance monitoring.
Pentaho Operations Mart helps you monitor Pentaho Server performance.
It includes two marts:
Business Analytics (BA) Operations Mart. Aggregates Pentaho Server audit data.
Data Integration (DI) Operations Mart. Stores and summarizes PDI logging data.
Business Analytics Operations Mart
Pentaho can automatically set up Operations Mart content for Analyzer, Interactive Reports, Report Designer, and dashboards.
BA Operations Mart aggregates data from Pentaho Server log files into pre-built audit reports.
If the pre-built reports do not meet your needs, use the Data Integration Operations Mart to change them.
Download and install Operations Mart files
Operations Mart files are stored in a pre-packaged ZIP file.
To install the files, stop the Pentaho Server, download and unpack the ZIP file, then restart the Pentaho Server to import the files.
Stop the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for start/stop steps.
Download the pentaho-operations-mart-11.0.0.0-<build number>.zip file from the Support Portal:
On the Support Portal home page, sign in using the Pentaho support credentials from your welcome packet.
In the Pentaho card, select Download.
In the 11.x list, select Pentaho 11.0 GA Release.
Scroll to the bottom of the Pentaho 11.0 GA Release page.
In the file component section, select the Operations Mart folder.
Download the pentaho-operations-mart-11.0.0.0-<build number>.zip file.
Unpack the ZIP file to a temporary directory.
Move the unpacked files to pentaho/server/pentaho-server/pentaho-solutions/system/default-content.
In default-content, delete any files you do not need for your repository database type. The following table lists the files to keep for each type of Pentaho Repository database.
Files to keep, by repository database type:
PostgreSQL:
pentaho-operations-mart-clean-11.0.0.0-<build number>.zip*
pentaho-operations-mart-etl-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
MySQL:
pentaho-operations-mart-clean-mysql5-11.0.0.0-<build number>.zip*
pentaho-operations-mart-etl-mysql5-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
Oracle:
pentaho-operations-mart-clean-oracle10g-11.0.0.0-<build number>.zip*
pentaho-operations-mart-etl-oracle10g-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
MS SQL Server:
pentaho-operations-mart-clean-mssql-11.0.0.0-<build number>.zip*
pentaho-operations-mart-etl-mssql-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
* Keep the pentaho-operations-mart-clean-<database>-11.0.0.0-<build number>.zip file only if you plan for Pentaho to automatically delete entries from the Operations Mart on a regular schedule. For details, see Clean up the BA Operations Mart.
Restart the Pentaho Server.
The Operations Mart is ready to be used.
Increase the maximum character length in audit table fields
You can increase the maximum number of characters permitted in an audit table field from 200 to 1024 characters.
The scripts in this procedure adjust the column width and re-index the Operations Mart tables.
Stop the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for start/stop steps.
Download the pentaho-server-ee-11.0.0.0-<build number>.zip file from the Support Portal:
On the Support Portal home page, sign in using the Pentaho support credentials from your welcome packet.
In the Pentaho card, select Download.
In the 11.x list, select Pentaho 11.0 GA Release.
Scroll to the bottom of the release page.
In the file component section, navigate to Pentaho Server/Archive Build (Suggested Installation Method).
Download the pentaho-server-ee-11.0.0.0-<build number>.zip file.
Unpack the ZIP file to a temporary directory.
In the temporary directory, navigate to pentaho\server\pentaho-server\data.
In data, open the directory for your repository database type:
mysql
oracle
postgresql
sqlserver
Locate these files:
alter_script_<repository database type>_BISERVER-13674.sql
pentaho_mart_upgrade_audit_<repository database type>.sql
Use the SQL client tool for your repository database type to run these scripts, in order:
alter_script_<repository database type>_BISERVER-13674.sql
pentaho_mart_upgrade_audit_<repository database type>.sql
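For example, with a PostgreSQL repository you might run the scripts with psql. This is a sketch only: the host, user, and database name shown are placeholders, not values from your installation.

```shell
# Hypothetical PostgreSQL example; substitute your own host, user, and database.
psql -h localhost -U postgres -d hibernate -f alter_script_postgresql_BISERVER-13674.sql
psql -h localhost -U postgres -d hibernate -f pentaho_mart_upgrade_audit_postgresql.sql
```
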
Delete the temporary directory.
Start the Pentaho Server.
The maximum field length for audit tables is 1024 characters.
Choose a pre-built Operations Mart report
Pentaho provides pre-built Interactive Reports and a data mart called Pentaho Operations Mart.
The Operations Mart contains all the data from system log files.
Choose the report that fits your needs, then follow the steps in View and edit Operations Mart reports.
Content Duration. Amount of time it takes a report to run.
Content Failures. File names of all content failures within a defined length of time.
Content Request Day of Month. Compare login metrics for days within a specified month.
Day of Week Request and Login Metrics. Compare login metrics for days of the week within a specified year.
Content Type Usage. List of content sorted by type, used within a defined length of time.
Content Usage. List of content usage within a defined length of time.
Hours in Day Request and Login Metrics. Compare login metrics by hour for a specified day.
Session Duration Counts. Login duration and number of logins per user for a specified length of time.
View and edit Operations Mart reports
Identify the report you want to open in Choose a pre-built Operations Mart report.
Double-click Browse Files.
Select public > Pentaho Operations Mart > BA Audit Reports.
Select the file and click Open.
Optional: Edit the report in the tool you used to open it.
See the Pentaho Business Analytics document for editing in Analyzer or Interactive Reports.
See the Pentaho Report Designer document for editing in Report Designer.
Create Operations Mart reports
If the pre-built reports do not meet your needs, create your own reports.
In the User Console, select Create New.
Select the report type.
Select the data source you need.
Information shown, by data source:
pentaho_operations_mart: BA Operations Mart - Component. Detail information related to the execution of the xactions that run reports on the Pentaho Server.
pentaho_operations_mart: BA Operations Mart - Content. Information related to the execution of content, such as which tool or which user ran the content on the Pentaho Server.
pentaho_operations_mart: BA Operations Mart - User Session. Information for the Pentaho Server related to a user, such as number of sessions, how long, and what time.
pentaho_operations_mart: PDI Operations Mart - Job Entry. Information about individual job entry executions on the Pentaho Server.
pentaho_operations_mart: PDI Operations Mart - Performance. Detailed performance information for the Pentaho Server.
pentaho_operations_mart: PDI Operations Mart - Step. Detailed information about individual step executions on the Pentaho Server.
pentaho_operations_mart: PDI_Operations_Mart. Information related to transformations and jobs run on the Pentaho Server.
Update the Operations Mart
You may need to update the report date and time, or the report data.
In the User Console, select Browse Files > public > Pentaho Operations Mart > Update Audit Mart.
Double-click Update Operations Mart Date & Time or Update BA Operations Mart Data.
View a report to confirm the updates.
Clean up the BA Operations Mart
The operational data in Operations Mart tables is set to automatically remove entries older than 365 days.
To change the schedule, add the ba.cleanup.max.age variable to kettle.properties in {user.home}/.kettle.
Stop the Pentaho Server.
Open {user.home}/.kettle/kettle.properties.
Add the ba.cleanup.max.age variable. For example, ba.cleanup.max.age=30 deletes entries older than 30 days.
Save the file.
Restart the Pentaho Server.
Data Integration Operations Mart
The Data Integration (DI) Operations Mart stores and summarizes PDI log data.
Use it to monitor jobs and transformations, and build reports and dashboards.
Set up the Data Integration Operations Mart
Setup depends on how you installed Pentaho. Use the instructions that match your installation method.
Follow these instructions if you installed Pentaho using the archive method. See the Install Pentaho Data Integration and Analytics document for archive installation.
Before you begin
Installation depends on these prerequisites.
Archive installation of the Pentaho Server
These steps assume a Pentaho Server archive installation. See the Install Pentaho Data Integration and Analytics document if needed.
Required database
Ensure your Pentaho Server and Pentaho Repository use one of these databases:
PostgreSQL
MySQL or MariaDB
Oracle
Microsoft SQL Server
Data Integration Operations Mart scripts
You need these scripts:
pentaho_logging_<databasename>.sql
pentaho_mart_<databasename>.sql
<databasename> matches your repository database type:
postgresql
mysql5 (MariaDB only)
oracle10g or oracle12c
sqlserver
Process overview
Step 1: Get the Data Integration Operations Mart files
Verify the required content ZIP files are available for Step 6.
Verify the following information before proceeding:
If you performed an archive installation with a PostgreSQL repository, skip to Step 2: Run the setup scripts.
If you performed an archive installation with a MySQL, Microsoft SQL Server, or Oracle repository, and you do not have pentaho-operations-mart-11.0.0.0-<build number>.zip, download it from the Support Portal.
Perform the following steps:
Download the pentaho-operations-mart-11.0.0.0-<build number>.zip file from the Support Portal:
On the Support Portal home page, sign in using the support credentials from your welcome packet.
In the Pentaho card, select Download.
In the 11.x list, select Pentaho 11.0 GA Release.
Scroll to the bottom of the release page.
In the file component section, select the Operations Mart folder.
Download the pentaho-operations-mart-11.0.0.0-<build number>.zip file.
Unpack the ZIP file to a temporary directory.
Verify these files exist in the temporary directory:
pentaho-operations-mart-operations-di-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
Verify you also have the two files required for your repository database type:
Oracle:
pentaho-operations-mart-etl-oracle10g-11.0.0.0-<build number>.zip
pentaho-operations-mart-clean-oracle10g-11.0.0.0-<build number>.zip
MySQL5:
pentaho-operations-mart-etl-mysql5-11.0.0.0-<build number>.zip
pentaho-operations-mart-clean-mysql5-11.0.0.0-<build number>.zip
Microsoft SQL Server:
pentaho-operations-mart-etl-mssql-11.0.0.0-<build number>.zip
pentaho-operations-mart-clean-mssql-11.0.0.0-<build number>.zip
Step 2: Run the setup scripts
The required setup scripts are in:
<install-directory>/pentaho-server/data/<databasename>
<databasename> is one of:
postgresql
mysql5 (MariaDB only)
oracle10g or oracle12c
sqlserver
Run these scripts in order:
pentaho_logging_<databasename>.sql
pentaho_mart_<databasename>.sql
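For instance, with a MySQL repository the scripts could be run with the mysql client. This is a sketch only: the host and user shown are placeholders for your own connection details.

```shell
# Hypothetical MySQL example; substitute your own host and credentials.
mysql -h localhost -u root -p < pentaho_logging_mysql.sql
mysql -h localhost -u root -p < pentaho_mart_mysql.sql
```
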
Step 3: Set the global Kettle logging variables
Perform this step on the machine where you installed the PDI client and Pentaho Server.
When you run PDI for the first time, PDI creates kettle.properties in:
$USER_HOME/.kettle/kettle.properties
In Spoon, select Edit > Edit the kettle.properties file.
Add or edit the variables for the log tables.
If you customized these values in the SQL scripts, use your customized values.
For Oracle and Microsoft SQL Server, leave the value blank for variables that contain _SCHEMA_.
Variable: Value
KETTLE_CHANNEL_LOG_DB: live_logging_info
KETTLE_CHANNEL_LOG_TABLE: channel_logs
KETTLE_CHANNEL_LOG_SCHEMA: pentaho_dilogs
KETTLE_JOBENTRY_LOG_DB: live_logging_info
KETTLE_JOBENTRY_LOG_TABLE: jobentry_logs
KETTLE_JOBENTRY_LOG_SCHEMA: pentaho_dilogs
KETTLE_JOB_LOG_DB: live_logging_info
KETTLE_JOB_LOG_TABLE: job_logs
KETTLE_JOB_LOG_SCHEMA: pentaho_dilogs
KETTLE_METRICS_LOG_DB: live_logging_info
KETTLE_METRICS_LOG_TABLE: metrics_logs
KETTLE_METRICS_LOG_SCHEMA: pentaho_dilogs
KETTLE_STEP_LOG_DB: live_logging_info
KETTLE_STEP_LOG_TABLE: step_logs
KETTLE_STEP_LOG_SCHEMA: pentaho_dilogs
KETTLE_TRANS_LOG_DB: live_logging_info
KETTLE_TRANS_LOG_TABLE: trans_logs
KETTLE_TRANS_LOG_SCHEMA: pentaho_dilogs
KETTLE_TRANS_PERFORMANCE_LOG_DB: live_logging_info
KETTLE_TRANS_PERFORMANCE_LOG_TABLE: transperf_logs
KETTLE_TRANS_PERFORMANCE_LOG_SCHEMA: pentaho_dilogs
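If you edit kettle.properties directly instead of through Spoon, the entries use standard name=value properties syntax. For example, the channel log variables look like:

```properties
KETTLE_CHANNEL_LOG_DB=live_logging_info
KETTLE_CHANNEL_LOG_TABLE=channel_logs
KETTLE_CHANNEL_LOG_SCHEMA=pentaho_dilogs
```
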
Step 4: Add the JNDI connections for logging
This section explains how to add the logging (live_logging_info) and Operations Mart (PDI_Operations_Mart) connections for a PDI client.
Go to <pdi-install>/data-integration/simple-jndi.
Open jdbc.properties.
Update the properties for your database type.
PostgreSQL:
MySQL:
MariaDB:
Oracle:
Microsoft SQL Server:
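As a sketch, PostgreSQL entries for the two connections follow the simple-jndi name/property pattern shown below. The host, port, database name, and credentials here are placeholders and must be replaced with your own values:

```properties
# Hypothetical PostgreSQL entries; adjust URL and credentials to your environment.
live_logging_info/type=javax.sql.DataSource
live_logging_info/driver=org.postgresql.Driver
live_logging_info/url=jdbc:postgresql://localhost:5432/hibernate
live_logging_info/user=hibuser
live_logging_info/password=password
PDI_Operations_Mart/type=javax.sql.DataSource
PDI_Operations_Mart/driver=org.postgresql.Driver
PDI_Operations_Mart/url=jdbc:postgresql://localhost:5432/hibernate
PDI_Operations_Mart/user=hibuser
PDI_Operations_Mart/password=password
```
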
Step 5: Add a JDBC connection for the Pentaho Server
Perform this task on the machine where you installed the Pentaho Server.
Go to <pentaho-server>/tomcat/webapps/pentaho/META-INF/.
Open context.xml.
Update the resources for your database type.
Optional: To use encrypted passwords, replace any factory setting with:
factory="org.pentaho.di.core.database.util.DecryptingDataSourceFactory"
PostgreSQL:
MySQL:
MariaDB:
Oracle:
Microsoft SQL Server:
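As a sketch, a PostgreSQL resource entry uses the standard Tomcat <Resource> form. The URL, credentials, and pool sizes below are placeholder assumptions, not values from your installation:

```xml
<!-- Hypothetical PostgreSQL resource; adjust URL, credentials, and pool settings. -->
<Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
          factory="org.apache.commons.dbcp.BasicDataSourceFactory"
          maxActive="20" maxIdle="5" maxWait="10000"
          username="hibuser" password="password"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://localhost:5432/hibernate"/>
```
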
Step 6: Add the DI Operations Mart ETL solutions to the Pentaho Repository default content folder
If you use PostgreSQL for the repository database, skip to Step 7: Initialize the DI Operations Mart.
Stop the Pentaho Server.
Locate the ZIP files containing the ETL solution and sample reports:
pentaho-operations-mart-operations-di-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
Locate the two ZIP files that are specific to your repository type:
/oracle (10g or 12c):
pentaho-operations-mart-etl-oracle10g-11.0.0.0-<build number>.zip
pentaho-operations-mart-clean-oracle10g-11.0.0.0-<build number>.zip
/mysql5:
pentaho-operations-mart-etl-mysql5-11.0.0.0-<build number>.zip
pentaho-operations-mart-clean-mysql5-11.0.0.0-<build number>.zip
/sqlserver:
pentaho-operations-mart-etl-mssql-11.0.0.0-<build number>.zip
pentaho-operations-mart-clean-mssql-11.0.0.0-<build number>.zip
Copy all four ZIP files (DI, BI, mart-etl, mart-clean) into:
$PENTAHO_HOME/pentaho-server/pentaho-solutions/system/default-content
Start the Pentaho Server.
After the server processes these files, it renames them with a timestamp.
Keep the renamed files in this directory.
Step 7: Initialize the DI Operations Mart
Launch Spoon.
Connect to the Pentaho Repository through the Pentaho Server.
Select File > Open.
Select Browse Files > Public > Pentaho Operations Mart > DI Ops Mart ETL.
Open each transformation and job.
In each transformation or job, open Properties and select the Logging tab.
Logging requires values for each Log connection field. A common approach is to use the global variables.
For job logging, set values for Job log table, Job entry log table, and Logging channel log table.
For transformation logging, set values for Transformation, Step, Performance, Logging channels, and Metrics.
Select File > Open.
Run Fill_in_DIM_DATE_and_DIM_TIME.
Run a few sample KTRs to generate log activity.
Select File > Open.
Run Update_Dimensions_then_Logging_Data.
Step 8: Verify the DI Operations Mart is working
In the Pentaho User Console, select Browse Files > Public > Pentaho Operations Mart > DI Audit Reports > Last_Run.
Verify the jobs and transformations ran.
Follow these instructions if you installed Pentaho using the manual method. See the Install Pentaho Data Integration and Analytics document for manual installation.
Before you begin
Installation depends on these prerequisites.
Manual installation of the Pentaho Server
These steps assume a Pentaho Server manual installation. See the Install Pentaho Data Integration and Analytics document if needed.
Required database
Ensure your Pentaho Server and Pentaho Repository use one of these databases:
PostgreSQL
MySQL or MariaDB
Oracle
Microsoft SQL Server
Data Integration Operations Mart scripts
You need these scripts:
pentaho_logging_<databasename>.sql
pentaho_mart_<databasename>.sql
<databasename> matches your repository database type:
postgresql
mysql5 (MariaDB only)
oracle10g or oracle12c
sqlserver
Process overview
Manual Step 1: Get the Data Integration Operations Mart files
Download Operations Mart files from the Support Portal. You need the scripts that create the log tables.
Download pentaho-operations-mart-11.0.0.0-<build number>.zip.
Unpack the ZIP file to a temporary directory.
Locate pentaho-operations-mart-ddl-11.0.0.0-<build number>.zip.
Move it to <pentaho-server>/data/ and unpack it.
This writes the scripts into the repository database directory.
/postgresql
pentaho_mart_postgresql.sql
pentaho_logging_postgresql.sql
/mysql5
pentaho_mart_mysql.sql
pentaho_logging_mysql.sql
/oracle (10g or 12c)
pentaho_mart_oracle.sql
pentaho_logging_oracle.sql
/sqlserver
pentaho_mart_sqlserver.sql
pentaho_logging_sqlserver.sql
Manual Step 2: Run the setup scripts
Run these scripts in order:
pentaho_logging_<databasename>.sql
pentaho_mart_<databasename>.sql
Manual Step 3: Set the global Kettle logging variables
This step matches the archive installation instructions. Use Step 3: Set the global Kettle logging variables.
Manual Step 4: Add the JNDI connections for logging
This step matches the archive installation instructions. Use Step 4: Add the JNDI connections for logging.
Manual Step 5: Add a JDBC connection for the Pentaho Server
This step matches the archive installation instructions. Use Step 5: Add a JDBC connection for the Pentaho Server.
Manual Step 6: Add the DI Operations Mart ETL solutions to the Pentaho Repository default content folder
If you use PostgreSQL for the repository database, skip to Manual Step 7: Initialize the DI Operations Mart.
Stop the Pentaho Server.
Locate:
pentaho-operations-mart-operations-di-11.0.0.0-<build number>.zip
pentaho-operations-mart-operations-bi-11.0.0.0-<build number>.zip
Locate the two ZIP files specific to your repository database type.
Copy all four ZIP files into:
$PENTAHO_HOME/pentaho-server/pentaho-solutions/system/default-content
Start the Pentaho Server.
After the server processes these files, it renames them with a timestamp.
Keep the renamed files in this directory.
Manual Step 7: Initialize the DI Operations Mart
This step matches the archive installation instructions. Use Step 7: Initialize the DI Operations Mart.
Manual Step 8: Verify the DI Operations Mart is working
This step matches the archive installation instructions. Use Step 8: Verify the DI Operations Mart is working.
Charts, reports, and dashboards
Once you create and populate your DI Operations Mart with log data, use the Pentaho User Console to examine this data and create reports, charts, and dashboards.
Pentaho provides pre-built reports, charts, and dashboards you can modify.
To understand the log fields, see Logging dimensions and metrics.
Logging tables status
The DI Operations Mart includes these tables.
Transformation log tables
The status values and their descriptions:
start
Indicates the transformation was started and retains this status until the transformation ends when no logging interval is set.
end
Transformation ended successfully.
stop
Indicates the user stopped the transformation.
error
Indicates an error occurred when attempting to run the transformation.
running
A transformation is only in this status directly after starting and does not appear without a logging interval.
paused
Indicates the user paused the transformation and does not appear without a logging interval.
Job log tables
The status values and their descriptions:
start
Indicates the job was started; it retains this status until the job ends when no logging interval is set.
end
Job ended successfully.
stop
Indicates the user stopped the job.
error
Indicates an error occurred when attempting to run the job.
running
A job is only in this status directly after starting and does not appear without a logging interval.
paused
Indicates the user paused the job and it does not appear without a logging interval.
Logging dimensions and metrics
These tables identify the dimensions and metrics used to create ETL log charts and reports.
Fact table
(fact_execution)
execution_date_tk
A technical key (TK) linking the fact to the date when the transformation/job was executed.
execution_time_tk
A technical key (TK) linking the fact to the time-of-day when the transformation/job was executed.
batch_tk
A technical key (TK) linking the fact to batch information for the transformation/job.
execution_tk
A technical key (TK) linking the fact to execution information about the transformation/job.
executor_tk
A technical key (TK) linking the fact to information about the executor (transformation or job).
parent_executor_tk
A technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tk
A technical key (TK) linking the fact to information about the root transformation/job.
execution_timestamp
The date and time when the transformation/job was executed.
duration
The length of time (in seconds) between when the transformation was logged (LOGDATE) and the maximum dependency date (DEPDATE).
rows_input
The number of lines read from disk or the network by the specified step. Can be input from files, databases, etc.
rows_output
The number of rows output during the execution of the transformation/job.
rows_read
The number of rows read in from the input stream of the specified step.
rows_written
The number of rows written during the execution of the transformation/job.
rows_rejected
The number of rows rejected during the execution of the transformation/job.
errors
The number of errors that occurred during the execution of the transformation/job.
Batch dimension
(dim_batch)
batch_tk
A technical key (TK) for linking facts to batch information.
batch_id
The ID number for the batch.
logchannel_id
A string representing the identifier for the logging channel used by the batch.
parent_logchannel_id
A string representing the identifier for the parent logging channel used by the batch.
Date dimension
(dim_date)
date_tk
A technical key (TK) for linking facts to date information.
date_field
A Date object representing a particular day (year, month, day).
ymd
A string representing the date value in year-month-day format.
ym
A string representing the date value in year-month format.
year
An integer representing the year value.
quarter
An integer representing the number of the quarter (1-4) to which this date belongs.
quarter_code
A 2-character string representing the quarter (Q1-Q4) to which this date belongs.
month
An integer representing the number of the month (1-12) to which this date belongs.
month_desc
A string representing the month (for example, January) to which this date belongs.
month_code
A string representing the shortened month code (for example, JAN) to which this date belongs.
day
An integer representing the day of the month (1-31) to which this date belongs.
day_of_year
An integer representing the day of the year (1-366) to which this date belongs.
day_of_week
An integer representing the day of the week (1-7) to which this date belongs.
day_of_week_desc
A string representing the day of the week (for example, Sunday) to which this date belongs.
day_of_week_code
A string representing the shortened day-of-week code (for example, SUN) to which this date belongs.
week
An integer representing the week of the year (1-53) to which this date belongs.
Execution dimension
(dim_execution)
execution_tk
A technical key (TK) for linking facts to execution information.
execution_id
A unique string identifier for the execution.
server_name
The name of the server associated with the execution.
server_host
The host name of the server associated with the execution.
executing_user
The name of the user who initiated the execution.
execution_status
The status of the execution (start, stop, end, error).
Executor dimension
The dim_executor table includes information about a job or transformation.
executor_tk
A technical key (TK) for linking facts to executor information.
version
An integer corresponding to the version of the executor.
date_from
A date representing the minimum date for which the executor is valid.
date_to
A date representing the maximum date for which the executor is valid.
executor_id
A string identifier for the executor.
executor_source
The source location (either file- or repository-relative) for the executor.
executor_environment *
The file server or repository name related to the executor_source.
executor_type
The executor type (for example, job or transformation).
executor_name
The name of the executor (for example, transformation name).
executor_desc
A string description of the executor (for example, job description).
executor_revision
A string representing the revision of the executor (for example, 1.3).
executor_version_label
A string representing a description of the revision (change comments).
exec_enabled_table_logging
Whether table logging is enabled for this executor. Values are Y if enabled; N otherwise.
exec_enabled_detailed_logging
Whether detailed (step or job entry) logging is enabled for this executor. Values are Y if enabled; N otherwise.
exec_enabled_perf_logging
Whether performance logging is enabled for this executor. Values are Y if enabled; N otherwise.
exec_enabled_history_logging
Whether historical logging is enabled for this executor. Values are Y if enabled; N otherwise.
last_updated_date
The date the executor was last updated.
last_updated_user
The name of the user who last updated the executor.
* Reserved for future use.
Log table
The dim_log_table table contains data used by Pentaho Operations Mart.
Do not modify the dim_log_table table. Use this table as reference only.
log_table_tk
A technical key (TK) for linking.
object_type
The type of PDI object being logged (for example, job, transformation, or step).
table_connection_name
The name of the database connection corresponding to the location of the transformation/job logging table.
table_name
The name of the table containing the transformation/job logging information.
schema_name
The name of the database schema corresponding to the location of the transformation/job logging table.
step_entry_table_conn_name
The name of the database connection corresponding to the location of the step/entry logging table.
step_entry_table_name
The name of the table containing the step/entry logging information.
step_entry_schema_name
The name of the database schema corresponding to the location of the step/entry logging table.
perf_table_conn_name
The name of the database connection corresponding to the location of the performance logging table.
perf_table_name
The name of the table containing the performance logging information.
perf_schema_name
The name of the database schema corresponding to the location of the performance logging table.
Time-of-day dimension
The dim_time table contains entries for every second of a day.
time_tk
A technical key (TK) for linking facts to time-of-day information.
hms
A string representing the time of day as hours-minutes-seconds (for example, 00:01:35).
hm
A string representing the time of day as hours-minutes (for example, 23:59).
ampm
Indicates whether the time is AM or PM. Values: am or pm.
hour
The hour of the day (0-23).
hour12
The hour of the day with respect to AM/PM (1-12).
minute
The minute of the hour (0-59).
second
The second of the minute (0-59).
Step fact table
The fact_step_execution table contains facts about individual step executions.
execution_date_tk
A technical key (TK) linking the fact to the date when the step was executed.
execution_time_tk
A technical key (TK) linking the fact to the time-of-day when the step was executed.
batch_tk
A technical key (TK) linking the fact to batch information for the step.
executor_tk
A technical key (TK) linking the fact to information about the executor (transformation).
parent_executor_tk
A technical key (TK) linking the fact to information about the parent transformation.
root_executor_tk
A technical key (TK) linking the fact to information about the root transformation/job.
execution_timestamp
The date and time when the step was executed.
step_tk
A technical key (TK) linking the fact to information about the step.
step_copy
The step copy number. This is zero if there is only one copy of the step, or (0 to N-1) if N copies are executed.
rows_input
The number of lines read from disk or the network by the step.
rows_output
The number of lines written to disk or the network by the step.
rows_read
The number of rows read from previous steps during the interval.
rows_written
The number of rows written to following steps during the interval.
rows_rejected
The number of rows rejected by the step’s error handling during the interval.
errors
The number of errors that occurred during the execution of the step.
Step dimension
The dim_step table contains information about individual steps and job entries.
step_tk
A technical key (TK) for linking facts to step/entry information.
step_id
The string name of the step/entry.
original_step_name *
The name of the step/entry template used to create this step/entry (for example, Table Input).
* Reserved for future use.
Job entry fact table
The fact_jobentry_execution table contains facts about individual job entry executions.
execution_date_tk
A technical key (TK) linking the fact to the date when the job entry was executed.
execution_time_tk
A technical key (TK) linking the fact to the time-of-day when the job entry was executed.
batch_tk
A technical key (TK) linking the fact to batch information for the job entry.
executor_tk
A technical key (TK) linking the fact to information about the executor (transformation or job).
parent_executor_tk
A technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tk
A technical key (TK) linking the fact to information about the root transformation/job.
step_tk
A technical key (TK) linking the fact to information about the job entry.
execution_timestamp
The date and time when the job entry was executed.
rows_input
The number of lines read from disk or the network by the job entry.
rows_output
The number of lines written to disk or the network by the job entry.
rows_read
The number of rows read in from the input stream of the job entry.
rows_written
The number of rows written to the output stream of the job entry.
rows_rejected
The number of rows rejected during the execution of the job entry.
errors
The number of errors that occurred during the execution of the job entry.
result
Whether the job entry finished successfully. Values: Y (successful) or N (otherwise).
nr_result_rows
The number of result rows after execution.
nr_result_files
The number of result files after execution.
Execution performance fact table
The fact_perf_execution table contains facts about performance snapshots for transformation executions.
execution_date_tk
A technical key (TK) linking the fact to the date when the transformation was executed.
execution_time_tk
A technical key (TK) linking the fact to the time-of-day when the transformation was executed.
batch_tk
A technical key (TK) linking the fact to batch information for the transformation.
executor_tk
A technical key (TK) linking the fact to information about the executor (transformation).
parent_executor_tk
A technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tk
A technical key (TK) linking the fact to information about the root transformation/job.
step_tk
A technical key (TK) linking the fact to information about the transformation/job.
seq_nr
The sequence number differentiating snapshots for a single execution.
step_copy
The step copy number. This is zero if there is only one copy, or (0 to N-1) if N copies are executed.
execution_timestamp
The date and time when the transformation was executed.
rows_input
The number of rows read from the input file, database, or network during the interval.
rows_output
The number of rows written to output file, database, or network during the interval.
rows_read
The number of rows read from previous steps during the interval.
rows_written
The number of rows written to following steps during the interval.
rows_rejected
The number of rows rejected by the step's error handling during the interval.
errors
The number of errors that occurred during the execution of the transformation/job.
input_buffer_rows
The size of the step’s input buffer in rows at the time of the snapshot.
output_buffer_rows
The size of the output buffer in rows at the time of the snapshot.
Clean up the DI Operations Mart tables
Cleaning the DI Operations Mart consists of running either a job or transformation that deletes data older than a specified maximum age.
The cleanup job and transformation are in the etl folder.
In Spoon, open one of the following:
Clean_up_PDI_Operations_Mart.kjb (job)
Clean_up_PDI_Operations_Mart_fact_table.ktr (transformation)
Set these parameters:
max.age.days (required): Maximum data age, in days.
schema.prefix (optional): For PostgreSQL, set <schema>. (with trailing period). For other databases, leave blank.
Run the job or transformation.
Running the job or transformation deletes data older than the maximum age.
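The cleanup job can also be run headless with Kitchen, the PDI command-line job runner. This sketch assumes the job was saved to a local file and uses a hypothetical retention of 90 days; the path and value are placeholders:

```shell
# Hypothetical example; the file path and parameter value are placeholders.
./kitchen.sh -file=Clean_up_PDI_Operations_Mart.kjb -param:max.age.days=90
```
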
See the Pentaho Data Integration document for details on scheduling regular cleanup.