Organizing ETL with projects
Use projects in Pentaho Data Integration (PDI) to store ETL workflow files and configurations in one place, including transformations, jobs, connections, and supporting content. Projects make ETL workflows easier to organize, share, and move between local systems, the Pentaho Repository, and virtual file systems (VFS). Regardless of PDI client (also called Spoon) configurations or project path variations, all users can consistently access the same files and settings for ETL workflows that are saved as a project. Additionally, because projects are isolated, you can safely reference transformations and jobs from other projects without creating configuration conflicts.
What is a project?
A project is a self-contained folder that holds everything unique to a specific ETL workflow, which can include one or more of the following items:
Transformation files (
.ktr)Job files (
.kjb)Variables
Configurations such as database connections, VFS connections, and run configurations
Data files used as input, or created as output, for transformations
Scripts or auxiliary files referenced by jobs or transformations
Documentation or notes relevant to the project

1
Use the Projects list to switch between recently opened projects. You can also select <No Project> to close the currently open project.
2
Use Project Files in the View tab to manage every file, folder, and subfolder for a project. You can create, rename, delete, or move files and folders, directly from this list.
3
Use Configurations in the View tab to view and manage all of the configurations for the project, such as database connections, VFS connections, and run configurations. Notice that configurations appear shaded when another configuration with the same name and higher priority is active.
Benefits of projects
When you use projects to manage your ETL workflows, you gain the following benefits and capabilities:
Manageability
Improved organization: Consolidate files and configurations for the ETL workflow into a single logical container (the project folder) and use subfolders to organize your workflow.
Scalability: Use multiple projects to segment and modularize integration work, making large ETL workflows easier to manage.
Visibility: In the View tab, see Project Files and Configurations for better visibility and improved management of your project files, connections, schemas, and run configurations.
Quick switching: Switch between all your open projects while you continue to work in the same location or between up to 10 recently opened projects when you change locations in the PDI client.
Portability: Move a project folder (including all files and configurations) without losing the context for the project. You can transfer the project between your local system, a Pentaho Repository, or a VFS.
Flexible collaboration
Ease of sharing: Share a project folder with other users and teams that need to review, contribute, or deploy the project. You can share a project folder as a ZIP file, upload it to a Pentaho Repository, save it in a shared VFS, or check it into a version control system.
Version control: Commit the project folder to version control so you can track changes and preserve shared history for the project.
Configuration management and security
Isolated configurations: Reduce conflicts by isolating ETL workflows in separate projects with their own unique configurations, connection parameters, and transformation strategies.
Enhanced security: Isolate configurations at the project level to help protect sensitive information (for example, database credentials).
Configuration prioritization: Define configurations at the appropriate level. When a configuration exists in multiple places, the PDI client uses the highest-priority version.
Backward compatibility: Continue working with existing ETL workflow configurations until you are ready to migrate to projects. Projects are an optional feature. You can even execute transformations and jobs with a project context even if the target file lives outside the project.
Configuration levels
Configurations are automatically applied in priority order to ensure the correct settings are used and to prevent unexpected conflicts. When a configuration exists in multiple locations, the PDI client applies the highest-priority version, reducing duplication and maintaining consistency for shared settings.
When configurations share the same name, the configuration at the highest level is applied. Configurations include connections, schemas, run configurations, slave servers, variables, and parameters.
Configurations are applied in the following priority order, from highest to lowest:
Default: A default configuration that cannot be edited or removed. Example: The Pentaho local run configuration.
Project: A configuration defined in the project that applies only to that project. Project-level configurations are only shown if you have an open project.
System: A configuration defined for projects in your local system. System-level configurations are not accessible to projects in a repository.
Repository: A configuration defined for projects in the connected repository. Repository-level configurations are not accessible to projects in your local system.
(Deprecated) File: File-level configurations have the lowest priority and can no longer be created. File-level configurations can only be edited or deleted. If you move a file with a file-level configuration into the Pentaho Repository, and the repository contains a configuration with the same name, a warning appears to indicate that the file configuration will overwrite the repository configuration.
Creating projects
You can create a project folder in any of the following locations:
Local file system: A folder on your computer.
Pentaho Repository: A folder in a connected Pentaho Repository.
VFS: A folder in a supported VFS* (for example, cloud storage) that the PDI client can access.
Create a project
Create a new project when you want to work with all the files and settings for an ETL workflow in a self-contained project folder that you can port to other locations and share with other users.
To create a new project, complete the following steps:
Open the PDI client.
Click Project > New Project. The New project dialog box opens.
Enter a Name for the project.
To navigate to the Project Path, click Browse. The Select a Folder dialog box opens.
Select a folder for your project by taking one of the following actions:
For an existing ETL workflow that you do not need to move, navigate to the root directory that contains every item in your workflow and select it.
To create a new project for a new ETL workflow, navigate to the folder where you want to create the project, click the Add folder icon, and in the Folder Name dialog box enter a New Folder Name and click OK.
Click Open.
In the New project dialog box, enter a Description for the project.
Click OK. The project is created and opened in the PDI client.
Create folders in a project
Create folders within your project to organize transformations, jobs, configurations, and related content, making it easier to manage the project as it becomes more complex.
Create a folder in your project by completing the following steps:
Open the PDI client.
Click Project > Open Project. The Select a Folder window opens.
Search or browse to the project folder and then click Open. The project opens.
Click the View tab.
Expand Project Files and then navigate to the existing folder in which you want to create your new folder.
Right-click the existing folder and select New Folder.
In the Folder Name dialog box, enter a folder name.
Click OK. The folder is created.
Share a project
Use one of the following options to share your project with other users:
Create the project in the Pentaho Repository or a shared VFS. See Create projects.
Move an existing project into a Pentaho Repository or shared VFS. See Move a project.
Check the project folder into your organization’s version control system.
Outside of the PDI client, save the project folder as a ZIP file and share it directly with another user. Projects shared as ZIP files can be opened in the recipient's chosen location. After the recipient saves the project to their chosen location, they can open it and then refresh the project by clicking Project > Refresh Project.
Editing projects
You can edit a project's name or description, rename or delete folders in the project folder, or move the entire project folder.
Edit project details
Edit a project details by completing the following steps:
Open the PDI client.
Click Project > Open Project. The Select a Folder window opens.
Search or browse to the project folder and then click Open. The project opens.
Click Project > Project Details. The Project Details window opens.
Update one or more of the following options:
Name
Description
Click OK. The details for the project are updated.
Edit a folder in a project
You can edit a folder in a project to change its name or delete it.
Edit a folder in a project by completing the following steps:
Open the PDI client.
Click Project > Open Project. The Select a Folder window opens.
Search or browse to the project folder and then click Open. The project opens.
Edit the folder by taking one of the following actions:
Rename a folder
Click the View tab.
Expand Project Files and then navigate to the folder you want to rename.
Right-click the folder and select Rename.
In the Rename dialog box, enter a new name and click OK.
Delete a folder
Click the View tab.
Expand Project Files and then navigate to the folder you want to delete.
Right-click the folder and select Delete.
Click Yes to confirm the deletion.
Move a project
Important:
If you move a project, you must move the entire project and its contents.
If you move individual files out of a project, such as transformation (.ktr) or job (.kjb) files, any references to other project items, such as database connections, will break.
Before moving a project, close it in the PDI client.
Projects can be moved between any of the following locations:
Local file system: A folder on your computer.
Pentaho Repository: A folder in a connected Pentaho Repository.
VFS: A folder in a supported VFS (for example, cloud storage) that the PDI client can access.
To move a project folder in your local system, use your operating system’s file explorer or command line.
To move a project folder into or out of the Pentaho Repository, see one of the following topics: Use the Repository Explorer or Upload and download from the Pentaho Repository.
To move a project into or out of a VFS, use your VFS management tool or file system interface.
After you have moved a project to its new location, open the project and refresh it.
Working with projects
You can switch between projects, refresh project data when it is modified outside of the PDI client, close a project, or migrate your existing ETL workflows into a project.
Switch between projects
You can quickly switch between open or recently opened projects in the PDI client.
To switch projects, above the View and Design tabs, click the Projects list and select the project that you want to switch to.
How the Projects list works
If you work in the same location, on your local system or in a connected repository, you can switch between all open projects.
If you restart the PDI client or connect to or disconnect from the Pentaho Repository, the 10 most recently opened projects remain in the Projects list for each location. The PDI client keeps separate lists of the 10 most recently opened projects for the local system and each user in the Pentaho Repository.
If a project is open, selecting another project closes the current project and opens the new one. Selecting <No Project> closes the current project.
If projects have the same name in the Projects list, each name shows the full path to help you tell them apart.
If a project is moved, deleted, or on an inaccessible VFS, it does not appear in the Projects list.
If a project is not in the Projects list, you must re-open it. See Open a project.
Refresh project data
Refresh an open project when files are added, removed, or modified outside of the PDI client.
To refresh a project, click Project > Refresh Project.
To refresh configurations that were edited outside of the PDI client, click the Refresh icon.
Close a project
With a project open, click Project > Close Project.
A warning is displayed if any open files in the project have unsaved changes.
As long as a project remains in the Projects list, its files are automatically restored the next time you open it.
Project variables
Project variables are a way to store small pieces of information dynamically in a narrower scope than Kettle or environment variables. Project variables are loaded either when a project is opened or when a transformation or job in the project is opened in the PDI client or run in a scheduled task. Use project variables to define variables that apply only to the specific project.
Important: When variables are defined in multiple places, the last one applied overrides the others.
Variables are applied in the following order:
Environment variables (JVM
-Darguments from the JVM process)Variables defined in
kettle.propertiesVariables defined in
project.propertiesVariables and parameters defined directly in the PDI client or in a scheduled job.
Variables set at runtime by steps such as
Set VariablesandSet Session Variables.
Set project variables on your local system
To edit project variables for projects saved on your local system, complete the following steps:
Open the
project.propertiesfile in a text editor. By default, theproject.propertiesfile is stored in the project folder, inside the.configfolder.Edit the file and save it.
(Optional) If the project you set the variables for is open in the PDI client, you must close the project and reopen it for the new variable values to be applied.
Set project variables in the Pentaho Repository
To edit project variables for projects saved in a Pentaho Repository:
Download the
project.propertiesfile from the project folder in the repository.Open the
project.propertiesfile in a text editor.Edit the file and save it.
Upload the edited
project.propertiesfile to the repository folder you downloaded it from.
Project run modifier
You can use -project argument when executing a transformation or job to specify the path for a project. The path can be to a folder in the Pentaho Repository, in a VFS, or on your local system.
Last updated
Was this helpful?

