Share a Pentaho Data Service with others

For information on how to share a Pentaho Data Service, and how others can connect to it and query it from Pentaho and non-Pentaho tools, see the following topics (a short JDBC sketch follows the list):

  • Share a Pentaho Data Service with others

  • Connect to the Pentaho Data Service from a Pentaho tool

  • Connect to the Pentaho Data Service from a Non-Pentaho tool

  • Query a Pentaho Data Service
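
Once a data service is shared and the Pentaho Data Service JDBC (thin) driver is installed, a non-Pentaho tool can query it much like an ordinary database table. The sketch below is a minimal illustration only, not a substitute for the topics above: the driver class and the jdbc:pdi URL form are the standard thin-driver conventions, and the host, port, webappname, credentials, and the service name my_service are placeholder assumptions for your own environment.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DataServiceQuerySketch {
    public static void main(String[] args) throws Exception {
        // Register the Pentaho thin JDBC driver (installed as described in the
        // "Connect to the Pentaho Data Service from a Non-Pentaho tool" topic).
        Class.forName("org.pentaho.di.trans.dataservice.jdbc.ThinDriver");

        // Assumed connection URL for a data service published to a local Pentaho Server;
        // replace the host, port, and webappname with your environment's values.
        String url = "jdbc:pdi://localhost:8080/kettle?webappname=pentaho";

        try (Connection conn = DriverManager.getConnection(url, "admin", "password");
             Statement stmt = conn.createStatement();
             // "my_service" is a hypothetical data service name; query it like a table.
             ResultSet rs = stmt.executeQuery("SELECT * FROM my_service")) {
            while (rs.next()) {
                // Print the first column of each returned row.
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

See the Query a Pentaho Data Service topic for the SQL literals and clauses the data service supports.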


