10.2 Data Integration
  • Pentaho Documentation
  • Pentaho Data Integration
    • Starting the PDI client
    • Use the PDI client perspectives
    • Customize the PDI client
  • Use a Pentaho Repository in PDI
    • Create a connection in the PDI client
    • Connect to a Pentaho Repository
    • Manage repositories in the PDI client
    • Unsupported repositories
    • Use the Repository Explorer
      • Access the Repository Explorer window
      • Create a new folder in the repository
      • Open a folder, job, or transformation
      • Rename a folder, job, or transformation
      • Delete a folder, job, or transformation
      • Move objects
      • Restore objects
      • Use Pentaho Repository access control
      • Use version history
  • Scheduler perspective in the PDI client
    • Schedule a transformation or job
    • Edit a scheduled run of a transformation or job
    • Stop a schedule from running
    • Enable or disable a schedule from running
    • Delete a scheduled run of a transformation or job
    • Refresh the schedule list
  • Streaming analytics
    • Get started with streaming analytics in PDI
    • Data ingestion
    • Data processing
  • Data Integration perspective in the PDI client
    • Basic concepts of PDI
    • Work with transformations
      • Create a transformation
      • Open a transformation
      • Rename a folder
      • Save a transformation
      • Run your transformation
        • Run configurations
          • Select an Engine
        • Options
        • Parameters and Variables
        • Analyze your transformation results
      • Stop your transformation
      • Configure transformation properties
      • Use the Transformation menu
    • Work with jobs
      • Create a job
      • Open a job
      • Rename a folder
      • Save a job
      • Run your job
        • Run configurations
          • Pentaho engine
        • Options
        • Parameters and variables
      • Stop your job
      • Configure job properties
      • Use the Job menu
    • Add notes to transformations and jobs
      • Create a note
      • Edit a note
      • Reposition a note
      • Delete a note
    • Connecting to Virtual File Systems
      • Before you begin
        • Access to Google Cloud
        • Access to HCP REST
        • Access to Microsoft Azure
      • Create a VFS connection
      • Edit a VFS connection
      • Delete a VFS connection
      • Access files with a VFS connection
      • Pentaho address to a VFS connection
      • Create a VFS metastore
        • Enable a VFS metastore
        • Metastore configuration
      • Steps and entries supporting VFS connections
      • VFS browser
        • Before you begin
          • Access to a Google Drive
        • Access files with the VFS browser
        • Supported steps and entries
        • Configure VFS options
    • Logging and performance monitoring
      • Set up transformation logging
      • Set up job logging
      • Logging levels
      • Monitor performance
        • Sniff Test tool
        • Monitoring tab
        • Use performance graphs
      • PDI performance tuning tips
      • Logging best practices
    • Advanced topics
      • Understanding PDI data types and field metadata
        • Data type mappings
          • Using the correct data type for math operations
        • Using the fields table properties
          • Applying formatting
          • Applying calculations and rounding
        • Output type examples
      • PDI run modifiers
        • Arguments
        • Parameters
          • VFS properties
        • Variables
          • Environment variables
          • Kettle Variables
            • Set Kettle variables in the PDI client
            • Set Kettle variables manually
            • Set Kettle or Java environment variables in the Pentaho MapReduce job entry
            • Set the LAZY_REPOSITORY variable in the PDI client
          • Internal variables
      • Use checkpoints to restart jobs
      • Use the SQL Editor
      • Use the Database Explorer
      • Transactional databases and job rollback
        • Make a transformation database transactional
        • Make a job database transactional
      • Web services steps
  • Advanced Pentaho Data Integration topics
    • PDI and Hitachi Content Platform (HCP)
    • Hierarchical data
      • Hierarchical data path specifications
    • PDI and Snowflake
      • Snowflake job entries in PDI
    • Copybook steps in PDI
      • Copybook transformation steps in PDI
      • Metadata discovery
    • Work with the Streamlined Data Refinery
      • How does SDR work?
        • App Builder, CDE, and CTools
          • Get started with App Builder
          • Community Dashboard Editor and CTools
      • Install and configure the Streamlined Data Refinery
        • Installing and configuring the SDR sample
          • Install Pentaho software
          • Download and install the SDR sample
        • Configure KTR files for your environment
        • Clean up the All Requests Processed list
        • Install the Vertica JDBC driver
        • Use Hadoop with the SDR
        • App endpoints for SDR forms
        • App Builder and Community Dashboard Editor
          • Get started with App Builder
          • Community Dashboard Editor and CTools
      • Use the Streamlined Data Refinery
        • How to use the SDR sample form
          • Edit the Movie Ratings - SDR Sample form
        • Building blocks for the SDR
          • Use the Build Model job entry for SDR
            • Create a Build Model job entry
            • Select existing model options
            • Variables for Build Model job entry
          • Using the Annotate Stream step
            • Use the Annotate Stream step
              • Creating measures on stream fields
                • Create a measure on a stream field
              • Creating attributes
                • Create an attribute on a field
              • Creating link dimensions
                • Create a link dimension
                • Create a dimension key
            • Creating annotation groups
              • Create an annotation group for sharing with other users
              • Create an annotation group locally
            • Metadata injection support
          • Using the Shared Dimension step for SDR
            • Create a shared dimension
            • Create a dimension key in Shared Dimension step
            • Metadata injection support
          • Using the Publish Model job entry for SDR
            • Use the Publish Model job entry
    • Use Command Line Tools to Run Transformations and Jobs
      • Startup script options
      • Pan Options and Syntax
      • Pan Status Codes
      • Kitchen Options and Syntax
      • Kitchen Status Codes
      • Import KJB or KTR Files From a Zip Archive
      • Connect to a Repository with Command-Line Tools
      • Export Content from Repositories with Command-Line Tools
    • Using Pan and Kitchen with a Hadoop cluster
      • Using the PDI client
      • Using the Pentaho Server
    • Use Carte Clusters
      • About Carte Clusters
      • Set up a Carte cluster
        • Carte cluster configuration
          • Configure a static Carte cluster
          • Configure a Dynamic Carte Cluster
            • Configure a Carte Master Server
            • Configure Carte slave servers
            • Tuning Options
          • Configure Carte servers for SSL
          • Configure Carte servers for JAAS
          • Change Jetty Server Parameters
            • In the Carte Configuration file
            • In the Kettle Configuration file
        • Initialize Slave Servers
        • Create a cluster schema
        • Run transformations in a cluster
      • Schedule Jobs to Run on a Remote Carte Server
      • Stop Carte from the Command Line Interface or URL
      • Run Transformations and Jobs from the Repository on the Carte Server
    • Connecting to a Hadoop cluster with the PDI client
      • Audience and prerequisites
      • Using the pre-installed Apache Hadoop driver
      • Using the Apache Vanilla Hadoop driver
      • Install a driver for the PDI client
        • Configure CDP Public Cloud cluster with the PDI client
      • Adding a cluster connection
        • Import a cluster connection
        • Manually add a cluster connection
        • Add security to cluster connections
          • Specify Kerberos security
        • Test a cluster connection
      • Managing Hadoop cluster connections
        • Edit Hadoop cluster connections
        • Duplicate a Hadoop cluster connection
        • Delete a Hadoop cluster connection
      • Connect other Pentaho components to a cluster
    • Partitioning data
      • Get started
        • Partitioning during data processing
        • Understand repartitioning logic
        • Partitioning data over tables
      • Use partitioning
        • Use data swimlanes
        • Rules for partitioning
      • Partitioning clustered transformations
      • Learn more
    • Pentaho Data Services
      • Creating a regular or streaming Pentaho Data Service
        • Data service badge
      • Open or edit a Pentaho Data Service
      • Delete a Pentaho Data Service
      • Test a Pentaho Data Service
        • Run a basic test
        • Run a streaming optimization test
        • Run an optimization test
        • Examine test results
          • Pentaho Data Service SQL support reference and other development considerations
            • Supported SQL literals
            • Supported SQL clauses
            • Other development considerations
      • Optimize a Pentaho Data Service
        • Apply the service cache optimization
          • How the service cache optimization technique works
          • Adjust the cache duration
          • Disable the cache
          • Clear the cache
        • Apply a query pushdown optimization
          • How the query pushdown optimization technique works
          • Add the query pushdown parameter to the Table Input or MongoDB Input steps
          • Set up query pushdown parameter optimization
          • Disable the query pushdown optimization
        • Apply a parameter pushdown optimization
          • How the parameter pushdown optimization technique works
          • Add the parameter pushdown parameter to the step
          • Set up parameter pushdown optimization
        • Apply streaming optimization
          • How the streaming optimization technique works
          • Adjust the row or time limits
      • Publish a Pentaho Data Service
      • Share a Pentaho Data Service with others
        • Share a Pentaho Data Service with others
        • Connect to the Pentaho Data Service from a Pentaho tool
        • Connect to the Pentaho Data Service from a Non-Pentaho tool
          • Step 1: Download the Pentaho Data Service JDBC driver
            • Download using the PDI client
            • Download manually
          • Step 2: Install the Pentaho Data Service JDBC driver
          • Step 3: Create a connection from a non-Pentaho tool
        • Query a Pentaho Data Service
          • Example
      • Monitor a Pentaho Data Service
    • Data lineage
      • Sample use cases
      • Architecture
      • Setup
      • API
      • Steps and entries with custom data lineage analyzers
      • Contribute additional step and job entry analyzers to the Pentaho Metaverse
        • Examples
          • Create a new Maven project
          • Add dependencies
          • Create a class which implements IStepAnalyzer
          • Create the Blueprint configuration
          • Build and test your bundle
          • See it in action
        • Different types of step analyzers
          • Field manipulation
          • External resource
          • Connection-based external resource
          • Adding analyzers from existing PDI plug-ins (non-OSGi)
    • Use the Pentaho Marketplace to manage plugins
      • View installed plugins and versions
      • Install plugins
  • Troubleshooting possible data integration issues
    • Troubleshooting transformation steps and job entries
      • 'Missing plugins' error when a transformation or job is opened
      • Cannot execute or modify a transformation or job
      • Step is already on canvas error
    • Troubleshooting database connections
      • Unsupported databases
      • Database locks when reading and updating from a single table
      • Force PDI to use DATE instead of TIMESTAMP in Parameterized SQL queries
      • PDI does not recognize changes made to a table
    • Jobs scheduled on Pentaho Server cannot execute transformation on remote Carte server
    • Cannot run a job in a repository on a Carte instance from another job
    • Troubleshoot Pentaho data service issues
    • Kitchen and Pan cannot read files from a ZIP export
    • Using ODBC
    • Improving performance when writing multiple files
    • Snowflake timeout errors
    • Log table data is not deleted
  • PDI transformation steps
    • Abort
      • General
      • Options
      • Logging
    • Add a Checksum
      • Options
      • Example
      • Metadata injection support
    • Add sequence
      • General
      • Database generated sequence
      • PDI transformation counter generated sequence
    • AMQP Consumer
      • Before you begin
      • General
        • Create and save a new child transformation
      • Options
        • Setup tab
        • Create a new AMQP Message Queue
        • Use an existing AMQP Message Queue
          • Specify Routing Keys
          • Specify Headers
        • Security tab
        • Batch tab
        • Fields tab
        • Result Fields tab
      • Metadata injection support
      • See also
    • AMQP Producer
      • Before you begin
      • General
      • Options
        • Setup tab
        • Security tab
      • Metadata injection support
      • See also
    • Avro Input
      • General
      • Options
        • Source tab
          • Embedded schema
          • Separate schema
        • Avro Fields tab
        • Lookup Fields tab
          • Sample transformation walkthrough using the Lookup field
      • Metadata injection support
    • Avro Output
      • General
      • Options
        • Fields tab
        • Schema tab
        • Options tab
      • Metadata injection support
    • Calculator
      • General
      • Options
        • Calculator functions list
      • Troubleshooting the Calculator step
        • Length and precision
        • Data Types
        • Rounding method for the Round (A, B) function
    • Catalog Input
    • Catalog Output
    • Common Formats
      • Date formats
      • Number formats
    • Copybook Input
      • Before you begin
      • General
      • Options
        • Input tab
        • Output tab
        • Options tab
      • Use Error Handling
      • Metadata injection support
    • CouchDB Input
      • Options
      • Metadata injection support
    • CSV File Input
      • Options
      • Fields
      • Metadata injection support
    • Data types
    • Delete
      • General
      • The key(s) to look up the value(s) table
      • Metadata injection support
    • Discover metadata from a text file
      • General
      • Options
        • Input tab
        • Delimiter candidates tab
        • Enclosure candidates tab
        • Escape candidates tab
        • Delimiter and data type detection rules
      • Examples
      • Data lineage
      • Metadata injection support
    • Elasticsearch REST bulk insert
      • Before you begin
      • General
      • Options
        • General tab
        • Document tab
          • Creating a document to index with stream field data
          • Using an existing JSON document from a field
        • Output tab
    • ETL metadata injection
      • General
      • Options
        • Inject Metadata tab
          • Specify the source field
          • Injecting metadata into the ETL Metadata Injection step
        • Options tab
      • Example
        • Input data
        • Transformations
        • Results
      • Reference links
        • Articles
        • Video
      • Steps supporting metadata injection
    • Execute Row SQL Script
      • General
        • Output fields
      • Metadata injection support
    • Execute SQL Script
      • Notes
      • General
      • Options
        • Optional statistic fields
      • Example
      • Metadata injection support
    • Extract to Rows
      • Options
      • Fields
      • Example
    • File exists (Step)
    • Generate rows
      • Options
      • Fields table
    • Get records from stream
      • General
      • Options
      • Metadata injection support
      • See also
    • Get rows from result
      • General
      • Options
      • Metadata injection support
    • Get System Info
      • General
        • Data types
      • Metadata injection support
    • Google Analytics v4
      • General
      • Before you begin
      • Options
        • Connection tab
        • Date ranges tab
        • Fields tab
        • Filters tab
        • Options tab
    • Group By
      • General
        • The fields that make up the group table
        • Aggregates table
      • Examples
      • Metadata injection support
    • Hadoop File Input
      • General
      • Options
        • File tab
          • Accepting file names from a previous step
          • Show action buttons
          • Selecting a file using regular expressions
        • Open file
        • Content tab
        • Error Handling tab
        • Filters tab
        • Fields tab
          • Number formats
          • Scientific notation
          • Date formats
      • Metadata injection support
    • Hadoop File Output
      • General
      • Options
        • File tab
        • Content tab
        • Fields tab
      • Metadata injection support
    • HBase Input
      • General
      • Options
        • Configure query tab
          • Key fields table
        • Create/Edit mappings tab
          • Fields
          • Additional notes on data types
        • Filter result set tab
          • Fields
      • Namespaces
      • Performance considerations
      • Metadata injection support
    • HBase Output
      • General
      • Options
        • Configure connection tab
        • Create/Edit mappings tab
      • Performance considerations
      • Metadata injection support
    • HBase row decoder
      • General
      • Options
        • Configure fields tab
        • Create/Edit mappings tab
          • Key fields table
            • Additional notes on data types
      • Using HBase Row Decoder with Pentaho MapReduce
      • Metadata injection support
    • Hierarchical JSON Input
      • General
      • Options
      • Examples
    • Hierarchical JSON Output
      • General
      • Fields
    • Java filter
      • General
      • Options
      • Filter expression examples
    • JMS Consumer
      • Before you begin
      • General
        • Create and save a new child transformation
      • Options
        • Setup tab
        • Security tab
        • Batch tab
        • Fields tab
        • Result fields tab
      • Metadata injection support
      • See also
    • JMS Producer
      • Before you begin
      • General
      • JMS connection information
        • Setup tab
        • Security tab
        • Options tab
        • Properties tab
      • Metadata injection support
      • See also
    • Job Executor
      • Samples
      • General
      • Options
        • Parameters tab
        • Execution results tab
        • Row grouping tab
        • Results rows tab
        • Result files tab
    • JSON Input
      • General
      • Options
        • File tab
          • Selected files table
        • Content tab
        • Fields tab
          • Select fields
        • Additional output fields tab
      • Examples
      • Metadata injection support
    • Kafka consumer
      • General
        • Create and save a new child transformation
      • Options
        • Setup tab
        • Batch tab
        • Fields tab
        • Result fields tab
        • Options tab
        • Offset Settings tab
          • Modes
      • Security
        • Using SSL
        • Using SASL
        • Using SASL SSL
      • Metadata injection support
      • See also
    • Kafka Producer
      • General
      • Options
        • Setup tab
        • Options tab
      • Security
        • Using SSL
        • Using SASL
        • Using SASL SSL
      • Metadata injection support
      • See also
    • Kinesis consumer
      • General
        • Create and save a new child transformation
      • Options
        • Setup tab
        • Batch tab
        • Fields tab
        • Result fields tab
        • Options tab
      • Metadata injection support
      • See also
    • Kinesis Producer
      • General
      • Options
        • Setup tab
        • Options tab
      • Metadata injection support
      • See also
    • Mapping
      • General
        • Log lines in Kettle
      • Options
        • Parameters tab
        • Input tab
        • Add inputs to table
        • Output tab
      • Mapping Input Specification
        • Options
      • Mapping Output Specification
        • Options
      • Samples
    • MapReduce Input
      • Options
      • Metadata injection support
    • MapReduce Output
      • Options
      • Metadata injection support
    • Memory Group By
      • General
        • The fields that make up the Group Table
        • Aggregates table
      • Metadata injection support
    • Merge rows (diff)
      • General
      • Options
      • Examples
      • Metadata injection support
    • Microsoft Access input
      • Options
        • File tab
        • Content tab
        • Fields tab
        • Additional output fields tab
      • Metadata injection support
    • Microsoft Excel Input
      • General
      • Options
        • Files tab
          • Selected files table
        • Sheets tab
        • Content tab
        • Error Handling tab
        • Fields tab
        • Additional output fields tab
      • Metadata injection support
    • Microsoft Excel Output
    • Microsoft Excel writer
      • General
      • Options
        • File & Sheet tab
          • File panel
          • Sheet panel
          • Template panel
        • Content tab
          • Content options panel
          • When writing to existing sheet panel
          • Fields panel
      • Metadata injection support
    • Modified Java Script Value
      • General
      • Java script functions pane
      • Java Script pane
        • Script types
        • Fields table
        • Modify values
      • JavaScript Internal API Objects
      • Examples
        • Check for the existence of fields in a row
        • Add a new field in a row
        • Use NVL in JavaScript
        • Split fields
        • Comparing values
        • String values
        • Numeric values
        • Filter rows
      • Sample transformations
    • Modify values from a single row
      • General
      • Targets
      • Example
    • Modify values from grouped rows
      • General
      • Grouping fields
      • Modifications
      • Example
    • Mondrian Input
      • General
    • MongoDB Execute
      • General
      • Options
        • Main tab
        • Step tab
      • Execute commands
        • Database commands
        • Collection commands
      • Example of Execute step
      • Metadata injection support
    • MongoDB Input
      • General
      • Options
        • Configure connection tab
        • Input options tab
          • Tag set specification table
        • Query tab
        • Fields tab
      • Examples
        • Query expression
        • Aggregate pipeline
      • Metadata injection support
    • MongoDB Output
      • General
      • Options
        • Configure connection tab
        • Output options tab
        • Mongo document fields tab
          • Example
            • Input data
            • Document field definitions
            • Document structure
        • Create/drop indexes tab
          • Create/drop indexes example
      • Metadata injection support
    • MQTT Consumer
      • General
        • Create and save a new child transformation
      • Options
        • Setup tab
        • Security tab
        • Batch tab
        • Fields tab
        • Result fields tab
        • Options tab
      • Metadata injection support
      • See also
    • MQTT Producer
      • General
      • Options
        • Setup tab
        • Security tab
        • Options tab
      • Metadata injection support
      • See also
    • ORC Input
      • Options
        • Fields
          • ORC types
      • Metadata injection support
    • ORC Output
      • General
      • Options
        • Fields tab
          • ORC types
        • Options tab
      • Metadata injection support
    • Parquet Input
      • General
        • Fields
          • PDI types
      • Metadata injection support
    • Parquet Output
      • General
      • Options
        • Fields tab
        • Options tab
      • Metadata injection support
    • Pentaho Reporting Output
      • General
      • Metadata injection support
    • Python Executor
      • Before you begin
      • General
      • Options
        • Script tab
          • Source panel
        • Input tab
          • Row by row processing
          • All rows processing
          • Mapping data types from PDI to Python
        • Output tab
          • Variable to fields processing
          • Frames to fields processing
          • Mapping data types from Python to PDI
    • Query HCP
      • Before you begin
      • General
      • Options
        • Query tab
        • Output tab
      • See also
    • Query metadata from a database
      • General
      • Options
        • Connection tab
        • Input tab
        • Fields tab
      • Metadata injection support
    • Read Metadata
    • Read metadata from Copybook
      • General
      • Example
      • Metadata injection support
    • Read metadata from HCP
      • General
      • Options
      • See also
    • Regex Evaluation
      • General
        • Capture Group Fields table
      • Options
        • Settings tab
          • Regular expression evaluation window
        • Content tab
      • Examples
    • Replace in String
      • General
      • Fields string table
      • Example: Using regular expression group references
      • Metadata injection support
      • See also
    • REST client step
      • General
      • Options
        • General tab
        • Authentication tab
        • SSL tab
        • Headers tab
        • Parameters tab
        • Matrix Parameters tab
    • Row Denormaliser
      • General
        • Group field table
        • Target fields table
      • Examples
      • Metadata injection support
    • Row Flattener
      • General
      • Example
    • Row Normaliser
      • General
        • Fields table
      • Examples
      • Metadata injection support
    • S3 CSV Input
      • Options
        • Fields
      • AWS credentials
      • Metadata injection support
      • See also
    • S3 File Output
      • Big Data warning
      • General
      • Options
        • File tab
        • Content tab
        • Fields tab
      • AWS credentials
      • Metadata injection support
      • See also
    • Salesforce bulk operation
      • General
      • Options
        • Connection tab
        • Operation tab
        • Fields tab
        • Advanced tab
      • Metadata injection support
    • Salesforce Delete
      • General
      • Options
        • Connection
        • Settings
    • Salesforce Input
      • General
      • Options
        • Settings tab
          • Connection
          • Settings
        • Content tab
          • Advanced
          • Additional fields
          • Other Fields
        • Fields tab
      • Metadata injection support
    • Salesforce Insert
      • General
      • Options
        • Connection
        • Settings
        • Output Fields
        • Fields
    • Salesforce Update
      • General
      • Options
        • Connection
        • Settings
        • Fields
    • Salesforce Upsert
      • General
      • Options
        • Connection
        • Settings
        • Output Fields
        • Fields
    • Select Values
      • General
      • Options
        • Select & Alter tab
        • Edit Mapping
        • Remove tab
        • Meta-data tab
      • Examples
      • Metadata injection support
    • Set Field Value
      • General
      • Options
      • Metadata injection support
    • Set Field Value to a Constant
      • General
      • Options
      • Metadata injection support
    • Simple Mapping (sub-transformation)
      • General
        • Log lines in Kettle
      • Options
        • Parameters tab
        • Input tab
        • Output tab
    • Single Threader
      • General
      • Options
        • Options tab
        • Parameters tab
    • Sort rows
      • General
        • Options
        • Fields column settings
      • Metadata injection support
    • Split Fields
      • General
      • Fields table
      • Example
      • Metadata injection support
    • Splunk Input
      • Prerequisites
      • General
      • Options
        • Connection tab
        • Fields tab
      • Raw field parsing
      • Date handling
      • Metadata injection support
    • Splunk Output
      • Prerequisites
      • General
      • Options
        • Connection tab
        • Event tab
      • Metadata injection support
    • String Operations
      • General
      • The fields to process
      • Metadata injection support
    • Strings cut
      • General
      • The fields to cut
      • Example
      • Metadata injection support
    • Switch-Case
      • Options
      • Example
      • Metadata injection support
    • Table Input
      • General
      • Options
      • Example
      • Metadata injection support
    • Table Output
      • General
      • Options
        • Main options tab
        • Database fields tab
        • Enter Mapping window
      • Metadata injection support
    • Text File Input
      • General
      • Options
        • File tab
          • Regular expressions
          • Selected files table
          • Accept file names
          • Show action buttons
        • Content tab
        • Error Handling tab
        • Filters tab
        • Fields tab
        • Additional output fields tab
      • Metadata injection support
    • Text File Output
      • General
      • Options
        • File tab
        • Content tab
        • Fields tab
      • Metadata injection support
      • See also
    • Transformation Executor
      • Error handling and parent transformation logging notes
      • Samples
      • General
      • Options
        • Parameters tab
          • Order of processing
        • Execution results tab
        • Row grouping tab
        • Result rows tab
        • Result files tab
    • Unique Rows
      • Prerequisites
      • General
        • Settings
      • See also
    • Unique Rows (HashSet)
      • General
        • Settings
      • See also
    • User Defined Java Class
      • Not complete Java
      • General (User Defined Java Class)
        • Class Code (User Defined Java Class)
          • Process rows
          • Error handling
          • Logging
        • Class and code fragments
      • Options
        • Fields tab
        • Parameters tab
        • Info steps tab
        • Target steps tab
      • Examples
      • Metadata injection support
    • Write Metadata
    • Write metadata to HCP
      • General
      • Options
      • See also
    • XML Input Stream (StAX)
      • Samples
      • Options
      • Element blocks example
    • XML Output
      • General
      • Options
        • File tab
        • Content tab
        • Fields tab
      • Metadata injection support
  • PDI job entries
    • Amazon EMR Job Executor
      • Before you begin
      • General
      • Options
        • EMR settings tab
          • AWS connection
          • Cluster
        • Job settings tab
    • Amazon Hive Job Executor
      • Before you begin
      • General
      • Options
        • Hive settings tab
          • AWS connection
          • Cluster
        • Job settings tab
    • Bulk load into Amazon Redshift
      • Before you begin
      • General
      • Options
        • Input tab
        • Output tab
        • Options tab
        • Parameters tab
    • Bulk load into Azure SQL DB
      • Before you begin
      • General
      • Options
        • Input tab
        • Output tab
        • Options tab
        • Advanced options tab
    • Bulk load into Databricks
      • General
      • Options
        • Input tab
        • Output tab
    • Bulk load into Snowflake
      • Before you begin
      • General
      • Options
        • Input tab
        • Output tab
        • Options tab
        • Advanced options tab
    • Create Snowflake warehouse
      • General
      • Options
        • Database connection and warehouse
        • Warehouse settings
        • Cluster settings
        • Activity settings
    • Delete Snowflake warehouse
      • General
      • Options
    • File Exists (Job Entry)
    • Google BigQuery loader
      • Before you begin
      • General
      • Options
        • Setup tab
        • File tab
    • Hadoop Copy Files
      • General
      • Options
        • Files/Folders tab
        • Settings tab
    • Job (job entry)
      • General
      • Options
        • Options tab
        • Logging tab
        • Argument tab
        • Parameters tab
    • Kafka Offset
      • General
      • Options
        • Setup tab
        • Options tab
        • Offset Settings tab
      • Examples
    • Modify Snowflake warehouse
      • General
      • Options
        • Database connection and warehouse
        • Warehouse settings
        • Cluster settings
        • Activity settings
    • Pentaho MapReduce
      • General
      • Options
        • Mapper tab
        • Combiner tab
        • Reducer tab
        • Job Setup tab
        • Cluster tab
          • Hadoop cluster configuration
        • User Defined tab
      • Use PDI outside and inside the Hadoop cluster
        • Pentaho MapReduce workflow
          • PDI Transformation
          • PDI Job
        • PDI Hadoop job workflow
        • Hadoop to PDI data type conversion
        • Hadoop Hive-specific SQL limitations
        • Big data tutorials
    • Spark Submit
      • Before you begin
      • Install and configure Spark client for PDI use
        • Spark version 2.x.x
      • General
      • Options
        • Files tab
          • Java or Scala
          • Python
        • Arguments tab
        • Options tab
      • Troubleshooting your configuration
        • Running a Spark job from a Windows machine
    • Sqoop Export
      • General
        • Quick Setup mode
        • Advanced Options mode
    • Sqoop Import
      • General (Sqoop Import)
        • Quick Setup mode
        • Advanced Options mode
    • Start Snowflake warehouse
      • General
      • Options
    • Stop Snowflake warehouse
      • General
      • Options
    • Transformation (job entry)
      • General
      • Options
        • Options tab
        • Logging tab
        • Arguments tab
        • Parameters tab