Create a transformation
Create a transformation to arrange a network of logical ETL tasks, called steps, into a data workflow.
To create a transformation in Pipeline Designer, complete the following steps:
Log in to the Pentaho User Console.
Open Pipeline Designer:
If you are using the Modern Design, in the menu on the left side of the page, click Pipeline Designer.
If you are using the Classic Design, click Switch to the Modern Design, and then in the menu on the left side of the page, click Pipeline Designer.
Pipeline Designer opens with the Quick Access section expanded.
In the Transformation card, click Create Transformation. A new, blank transformation opens with the Design pane selected.
Add steps to the transformation:
In the Design pane, search for or browse to each step you want to use in the transformation. You may need to expand sections in the Design pane to find steps.
Drag the steps you want to use onto the canvas.
Work with steps on the canvas by hovering over a step to open the step menu and selecting one of the following options:
Step option
Description
Delete
Deletes the step from the canvas.
Edit
Opens the Step Name window, where you can configure the properties of the step. Step properties may appear in multiple sections, tabs, or both.
Note: To learn more about the step you're configuring, in the lower-left corner of the Step Name window, click Help.
Duplicate
Adds a copy of the step to the canvas.
More Actions > Change Number of Copies
Opens the Number of copies dialog box, where you can enter a number or a variable to specify how many copies of the step are processed in parallel when the transformation is run. To find a variable, in the Number of copies (1 or higher) box, click the Select variable to insert icon.
More Actions > Data Movement
Opens a list of data movement options that specify how data rows are distributed to the next steps of the transformation. Round-Robin is the default setting.
Round-Robin: Distributes rows evenly across all parallel step copies by sending one row to each copy in turn. Use this setting to spread rows evenly when the transformation includes multiple instances of the next step.
Load Balance: Routes rows to the step copy with the lightest processing load. This setting can improve performance when processing times vary across parallel step instances.
Copy Data to Next Steps: Sends each row to all parallel step copies. Use this setting when every downstream branch must process the complete dataset independently.
To add hops between steps, hover over a step’s handle until a plus sign (+) appears, then drag the connection to the handle of another step.
(Optional) To add a note on the canvas, in the canvas toolbar, click the Add Note icon, and then in the Notes dialog box, enter your note and click Save.
Note: To format the note, click Style in the Notes dialog box, and then enter the font, color, and shadow options you want to use.
Save the transformation:
Click Save. The Select File or Directory dialog box opens.
Search for or browse to the folder in the repository where you want to save the transformation.
(Optional) To create a new folder in the repository, click the New Folder icon, and then in the New folder dialog box, enter a New folder name and click Save.
(Optional) To delete a folder from the repository, select the folder and click the Delete icon.
In the Select File or Directory dialog box, click Save. The Save Change dialog box opens.
Click Yes to confirm that you want to save the transformation.
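The three data movement options described under More Actions > Data Movement can be pictured with a small Python sketch. It is an illustration only, not Pentaho's implementation: the function names, the list-of-lists representation of parallel step copies, and the per-row cost function used to approximate "processing load" are all assumptions made for this example.

```python
def round_robin(rows, n_copies):
    """Deal rows to step copies in turn: copy 0, 1, ..., n-1, then 0 again."""
    buckets = [[] for _ in range(n_copies)]
    for i, row in enumerate(rows):
        buckets[i % n_copies].append(row)
    return buckets


def load_balance(rows, n_copies, cost):
    """Send each row to the copy with the least accumulated work so far.

    `cost` is a hypothetical function estimating how much work a row causes;
    how the product actually measures load is not stated in this document.
    """
    buckets = [[] for _ in range(n_copies)]
    load = [0] * n_copies
    for row in rows:
        # Ties go to the lowest-numbered copy.
        target = min(range(n_copies), key=lambda i: load[i])
        buckets[target].append(row)
        load[target] += cost(row)
    return buckets


def copy_to_all(rows, n_copies):
    """Give every copy its own complete copy of the rows."""
    rows = list(rows)
    return [list(rows) for _ in range(n_copies)]


if __name__ == "__main__":
    rows = [5, 1, 1, 1, 1, 1]
    print(round_robin(rows, 2))
    print(load_balance(rows, 2, cost=lambda r: r))
    print(copy_to_all(rows, 2))
```

In this sketch, round-robin splits the rows by position regardless of how expensive each row is, while the load-balancing variant keeps the one expensive row on its own copy and routes the cheap rows to the other. Copying gives every downstream copy the full dataset, so the total number of rows processed grows with the number of copies.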

