All rows processing

Select the All rows option to process all of your data at once, for example, as a Python list of dictionaries. This option is also commonly used with a pandas DataFrame, which can hold a broad set of data that does not have to be joined ahead of time.
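As a minimal sketch of why processing all rows at once can be useful, the following example computes values that depend on the complete data set. The variable name `rows` and the `region` and `sales` fields are hypothetical. In a real transformation, the variable would be supplied through the input mapping rather than defined in the script.

```python
# Hypothetical input: with All rows selected, assume the mapped variable
# `rows` holds one dictionary per PDI row, all collected in a single list.
rows = [
    {"region": "east", "sales": 120.0},
    {"region": "west", "sales": 80.0},
    {"region": "east", "sales": 50.0},
]

# Because every row is present, whole-data-set calculations are possible.
total_sales = sum(row["sales"] for row in rows)

# Sum the sales per region.
sales_by_region = {}
for row in rows:
    region = row["region"]
    sales_by_region[region] = sales_by_region.get(region, 0.0) + row["sales"]

# Each row's share of the grand total requires the complete data set.
shares = [round(row["sales"] / total_sales, 2) for row in rows]
```

A calculation such as `shares` cannot be finished until the last row has arrived, which is the kind of case where processing all rows together fits.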

CAUTION:

When you select All rows, the input data accumulates in the Python Executor step until all of the data is present. For large data sets (a gigabyte or larger), this accumulation can increase the amount of memory that PDI requires. If you have concerns, consult your Infrastructure Engineer or System Administrator for guidance on your specific use cases.
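One rough way to gauge the Python-side footprint before running a full load is to measure a small sample and extrapolate. The sketch below is only an illustration: the sample data and the `expected_rows` figure are hypothetical, and the estimate covers the pandas DataFrame only, not the memory that PDI itself uses while accumulating rows.

```python
import pandas as pd

# Hypothetical sample standing in for a slice of the real input data.
sample = pd.DataFrame({"id": range(1000), "name": ["row"] * 1000})

# deep=True includes the memory held by string objects, not just pointers.
bytes_per_row = sample.memory_usage(deep=True).sum() / len(sample)

# Hypothetical row count for the full data set.
expected_rows = 50_000_000

# Extrapolate the sample's per-row cost to the full data set, in gibibytes.
estimated_gb = bytes_per_row * expected_rows / 1024**3
```

Treat the result as an order-of-magnitude check rather than a sizing guarantee.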

Input tab in Python Executor

On this tab, you can select variable names, indicate the input step in the transformation, and specify the data structure.

Option

Description

Available variables

Use the Plus Sign button to add a Python variable to the input mapping for the script used in the transformation. You can remove the Python variable by clicking the X icon.

Variable name

Enter the name of the Python variable. The list of Available variables will automatically update.

Step

Specify the name of the input step to map from. It can be any step in the parent transformation with an outgoing hop connected to the Python Executor step.

Data structure

Specify the data structure from which you want to pull the fields for mapping. You can select one of the following:

- Pandas DataFrame: The tabular data structure for Python/pandas.
- NumPy array: A table of values, all of the same type, which is indexed by a tuple of non-negative integers.
- Python list of dictionaries: Each row in the PDI stream becomes a Python dictionary. All of the dictionaries are put into a Python list.
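To make the difference between these data structures concrete, the following sketch represents the same two rows as a list of dictionaries, a pandas DataFrame, and a NumPy array. The field names `height` and `width` are hypothetical, and the conversions shown are ordinary pandas and NumPy calls, not necessarily how the step builds the structures internally.

```python
import numpy as np
import pandas as pd

# Hypothetical PDI rows with two numeric fields.
records = [
    {"height": 1.5, "width": 2.0},
    {"height": 3.0, "width": 4.5},
]

# Python list of dictionaries: one dictionary per row, keyed by field name.
as_list = records

# pandas DataFrame: tabular, one named column per field.
as_frame = pd.DataFrame(records)

# NumPy array: all values share one dtype and are indexed by a tuple.
as_array = as_frame.to_numpy()

# Second row, first column: the same value reached three different ways.
from_list = as_list[1]["height"]
from_frame = as_frame.loc[1, "height"]
from_array = as_array[1, 0]
```

The array drops the field names and keeps positions only, so it suits uniform numeric data. The DataFrame and the list of dictionaries both keep the field names.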

The **Mapping** table contains the following field properties.

Field Property
Description

Data structure field

The value of the Python data structure field to which you want to map the PDI field.

Data structure type

The value of the data structure type assigned to the data structure field to which you want to map the PDI field. For detailed information on data types, see Mapping data types from PDI to Python.

PDI field

The name of the PDI field that contains the vector data stored in the mapped Python variable.

PDI data type

The value of the data type assigned to the PDI field, such as a date, a number, or a timestamp.

Select the Get fields button to populate the table with fields from the input step(s) in your transformation. If necessary, you can modify your selections.
