Manage the Pentaho Repository
You can use the Pentaho Repository to share files and folders across teams and products.
The Pentaho Repository is an environment for collaborative analysis and ETL (extract, transform, and load) development.
Upload and download from the Pentaho Repository
You can upload and download from the Pentaho Repository with the Pentaho User Console (PUC) or the command line interface. Uploading and downloading assume that you have already created a data source, that data content exists to be pushed, and that permissions for the repository are defined.
For uploading, any starting location can be selected. Permission settings are inherited through the folder structure if the destination location has existing permission settings. It is advisable to keep existing security settings as defaults for the upload.
For downloading, you are able to select the destination location for the downloaded file or folder. The download process always creates a ZIP file that includes a manifest file along with the downloaded content. The manifest file contains the collection of permissions settings for the downloaded files and folders and is found in the root directory of the ZIP file.
The following file types and artifacts are supported for uploading and downloading from the Pentaho Repository:
Reporting (.prpt, .prpti, .xml)
Analyzer (.xanalyzer)
Dashboards (.xdash)
Solution Files (.xaction, .locale)
The following file types are hidden by default in the Pentaho Repository:
Web (.html, .htm)
Reporting (.xml)
Solution Files (.properties)
Graphics (.png, .jpg, .gif, .svg)
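The two groups of extensions above can be captured in a small lookup, for example to pre-check files before an upload. This is a minimal Python sketch, not part of Pentaho: the function name `classify` and the labels are illustrative. It assumes the first group is the supported list and the second is the hidden-by-default list, and that extensions are matched case-insensitively, which the text above does not state.

```python
from pathlib import PurePosixPath

# Extensions copied from the two lists above; ".xml" appears in both.
SUPPORTED = {".prpt", ".prpti", ".xml", ".xanalyzer", ".xdash", ".xaction", ".locale"}
HIDDEN_BY_DEFAULT = {".html", ".htm", ".xml", ".properties", ".png", ".jpg", ".gif", ".svg"}


def classify(filename: str) -> set:
    """Return which lists ("supported", "hidden") mention the file's extension."""
    ext = PurePosixPath(filename).suffix.lower()  # assumption: case-insensitive match
    labels = set()
    if ext in SUPPORTED:
        labels.add("supported")
    if ext in HIDDEN_BY_DEFAULT:
        labels.add("hidden")
    return labels
```

An empty result means the extension is in neither list.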
Upload folders and files
The User Console can be used to upload files and folders to the Pentaho Repository. To upload files, a user must have a role with the Publish Content operation permission and Write permission for the target folder (Browse Files > Properties > Share > Users and Roles > Permissions), where permissions can be held through a user name or a role that the user holds (Power User for example). See the Pentaho Business Analytics document for details on publish and write permissions.
For Retain permission on upload file, the file permission contained in the uploaded ZIP (exportManifest.xml) is the permission applied in the repository. If the file does not have an entry in exportManifest.xml for the permission, the default permission, which is inherit, is used. This is equivalent to the command line switch: --permission=true
For Set Owner based on uploaded file, the owner found in the uploaded ZIP (exportManifest.xml) becomes the owner of the file in the repository. If the file does not have an entry in exportManifest.xml for the owner, the owner is set to the user who is uploading the ZIP. This is equivalent to the command line switch: --retainOwnership=true
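The fallback behavior of these two options can be sketched as a pair of dictionary lookups. This Python sketch only illustrates the rule described above; it does not parse exportManifest.xml. The two dictionaries and the function name `resolve_upload_settings` are hypothetical stand-ins for whatever the manifest contains.

```python
def resolve_upload_settings(path, manifest_permissions, manifest_owners, uploading_user):
    """Apply the documented fallbacks for one uploaded file.

    manifest_permissions / manifest_owners: hypothetical dicts keyed by file
    path, standing in for the entries found in exportManifest.xml.
    """
    # No permission entry in the manifest: fall back to the default, "inherit".
    permission = manifest_permissions.get(path, "inherit")
    # No owner entry in the manifest: the uploading user becomes the owner.
    owner = manifest_owners.get(path, uploading_user)
    return permission, owner
```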
Complete the following steps to upload one or more files to the repository with the User Console.
From the User Console Home, click Browse Files.
The Browse Files page appears.
From the Browse pane on the left, click to choose the destination folder for the upload.
With the destination folder highlighted, click Upload in the Folder Actions pane on the right.
The Upload dialog box appears.
Browse to the files or zipped folders to be uploaded by clicking Browse.
Select one or more files or zipped folders to upload.
Click OK to begin upload using the default settings.
Choose preferences for the upload by clicking to expand the Advanced Options menu.
Choose Replace the Existing File or Do Not Upload from the first menu.
Choose File Permissions from the second menu.
The choices are Do Not Change Permissions or Retain Permissions on the Uploaded File.
If you selected Retain Permissions on the Uploaded File, choose File Ownership by selecting Do Not Change Owner or Set Owner Based on Uploaded File from the third menu.
Choose None, Short, or Verbose from the Logging menu.
Click OK.
The upload runs and the files or folders are uploaded to the repository. If the upload fails, an error log window opens with specific information.
Upload from the command line
Open the command line interface by clicking Start and typing cmd. Press Enter.
From the command line interface, go to the location where you have a local copy of the Pentaho Server installed, such as:
C:/dev/pentaho/pentaho-server
Enter a space, then type the arguments for upload into the command line interface. A completed upload argument would look something like this:
import-export.bat --import --url=http://localhost:8080/pentaho --username=dvader --password=password --charset=UTF-8 --path=/public --file-path=C:/Users/dvader/Downloads/pentaho-solutions.zip --overwrite=true --permission=true --retainOwnership=true
Press Enter after the arguments are typed.
The upload process runs and the results are displayed in the command interface. If an argument is required for successful upload and has not been provided, the missing requirement is displayed in the command interface. The Command line arguments reference has a list of available command line arguments for uploading.
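If uploads are scripted, the argument list can be assembled programmatically instead of typed by hand. The following Python sketch builds the same arguments as the example above; the function name `build_import_command` and its parameter names are illustrative, and only switches that appear in the example are used.

```python
def build_import_command(url, username, password, repo_path, zip_path,
                         overwrite=True, permission=True, retain_ownership=True,
                         charset="UTF-8", script="import-export.bat"):
    """Return the upload command as a list of arguments (nothing is executed)."""
    def flag(value):
        # The example above writes booleans in lowercase.
        return "true" if value else "false"

    return [
        script, "--import",
        f"--url={url}",
        f"--username={username}",
        f"--password={password}",
        f"--charset={charset}",
        f"--path={repo_path}",
        f"--file-path={zip_path}",
        f"--overwrite={flag(overwrite)}",
        f"--permission={flag(permission)}",
        f"--retainOwnership={flag(retain_ownership)}",
    ]
```

The returned list could be passed to `subprocess.run(..., cwd=<server install directory>)`. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.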
Download folders and files
Downloading folders and files can be done through the User Console or through the command line interface. The download process always creates a ZIP file that includes a manifest file along with the downloaded content. The manifest file is a collection of the permissions settings for the downloaded files and folders and is found in the root directory of the ZIP file.
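A downloaded archive can be checked for its manifest with a few lines of Python. This is a minimal sketch: it assumes the manifest carries the name exportManifest.xml, the name used for uploaded ZIP files earlier in this topic. The function name `has_root_manifest` is illustrative.

```python
import zipfile

MANIFEST_NAME = "exportManifest.xml"  # assumption: same name as in uploaded ZIPs


def has_root_manifest(zip_path, manifest_name=MANIFEST_NAME) -> bool:
    """Return True if the ZIP has the manifest in its root directory."""
    with zipfile.ZipFile(zip_path) as archive:
        # namelist() returns full member paths, so an exact match on the bare
        # file name means the entry sits at the top level of the archive.
        return manifest_name in archive.namelist()
```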
Download action permissions
By default, only the Administrator role has downloading permissions. The roles that have download action permissions are defined in the pentaho.xml configuration file. To add downloading permissions for a user, add that user to a role that has download permissions as shown below:
CAUTION:
Providing the download action permissions to non-admin users can expose sensitive data.
Stop the Pentaho Server.
Navigate to the pentaho.xml file, located at: /server/pentaho-server/pentaho-solutions/system/pentaho.xml
Open the file with any text editor and locate the download-roles node in the file.
Add additional roles as needed.
To create a Power User, type:
Note: Separate roles with commas, not spaces.
Save and close the pentaho.xml file.
Restart the Pentaho Server.
Restart the User Console. Log on as a user with that role.
You will now see the Download option in the File Actions pane in the Browse Files perspective of the User Console.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
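The edit to the download-roles node can also be sketched in code. This Python sketch assumes that the node holds its roles as comma-separated text, which is consistent with the note about commas but is not shown in this topic. The function name `add_download_role` and the sample XML in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_download_role(xml_text: str, role: str) -> str:
    """Append a role to the download-roles node, comma-separated, no spaces added."""
    root = ET.fromstring(xml_text)
    node = root if root.tag == "download-roles" else root.find(".//download-roles")
    if node is None:
        raise ValueError("no download-roles node found")
    # Assumption: the node text is a comma-separated list of role names.
    roles = [r.strip() for r in (node.text or "").split(",") if r.strip()]
    if role not in roles:
        roles.append(role)
    node.text = ",".join(roles)
    return ET.tostring(root, encoding="unicode")
```

For example, applying it with the role Power User to a fragment whose download-roles node contains Administrator yields Administrator,Power User. ElementTree discards XML comments when parsing, so for the real pentaho.xml file a manual edit in a text editor, as described in the steps above, may be preferable.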
Download a folder
From the User Console Home, click Browse Files.
The Browse Files page appears.
From the Browse pane on the left, browse to the location of the folder to be downloaded.
With the folder highlighted, click Download in the Folder Actions pane on the right.
Choose Save File in the window that appears, and click OK.
The folder is saved as a ZIP file with the manifest located in the top level of the file.
Download a file
From the User Console Home, click Browse Files.
The Browse Files page appears.
Browse to the location of the file by clicking through the folders in the Browse pane on the left.
The Files pane in the center populates with a list of reports.
Click to select the file in the Files pane and choose Download in the Folder Actions pane on the right.
Choose Save File in the window that appears, and click OK.
The file is saved as a ZIP file with the manifest located in the top level of the file.
Download from the command line
Open the command line interface by clicking Start and typing cmd. Press Enter.
From the command line interface, go to the location where you have a local copy of the Pentaho Server installed, such as:
C:/dev/pentaho/pentaho-server
Enter a space, then type the arguments for download into the command line interface. A completed download argument would look something like this:
import-export.bat --export --url=http://localhost:8080/pentaho --username=dvader --password=password --charset=UTF-8 --path=/public --file-path=C:/Users/dvader/Downloads/pentaho-solutions.zip --overwrite=true --permission=true --retainOwnership=true
Press Enter after typing the arguments.
The download process runs and the results are displayed in the command interface. The file is saved as a ZIP file with the download manifest located in the top level of the file. If an argument is required for successful download and has not been provided, the missing requirement is displayed in the command interface. The Command line arguments reference has a list of available command line arguments for downloading.
Response code definitions
Here is a list of response codes for the import-export.bat script:
1: Publish to server failed.
2: General publish error.
3: Publish successful.
5: Authentication to the publish server failed. Username or password is incorrect.
6: Datasource publish failed.
7: XMLA catalog already exists.
8: Schema already exists.
9: Content about to be published already exists.
10: Error publishing to the server due to prohibited symbols in the name of the content.
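In automation, the response codes above can be mapped to their definitions with a lookup table. This Python sketch only encodes the list; how the script surfaces the code (for example, in its output) is not specified here, so obtaining the code is left to the caller. The names `RESPONSE_CODES`, `describe_response`, and `is_success` are illustrative.

```python
# Definitions copied from the list above. Code 4 is not defined in that list.
RESPONSE_CODES = {
    1: "Publish to server failed.",
    2: "General publish error.",
    3: "Publish successful.",
    5: "Authentication to the publish server failed. Username or password is incorrect.",
    6: "Datasource publish failed.",
    7: "XMLA catalog already exists.",
    8: "Schema already exists.",
    9: "Content about to be published already exists.",
    10: "Error publishing to the server due to prohibited symbols in the name of the content.",
}


def describe_response(code: int) -> str:
    """Return the documented definition, or a placeholder for unlisted codes."""
    return RESPONSE_CODES.get(code, f"Unknown response code: {code}")


def is_success(code: int) -> bool:
    """Only code 3 is documented as a successful publish."""
    return code == 3
```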
Command line arguments reference
You can use the command line to manage the Pentaho Repository. The following tables list the command arguments, descriptions, values, and whether a specific argument is required.
Upload
The following arguments are for uploading to the Pentaho Repository:
Argument | Description | Values | Required
-i, --import | Upload command | n/a | Yes
-x, --source <arg> | External system type | legacy-db or file-system (default) | Yes
-o, --overwrite <arg> | Overwrites file(s) on upload. Default value is: True | Boolean | No
-m, --permission <arg> | Applies ACL using manifest file. Default value is: True | Boolean | No
-r, --retainOwnership | Replaces the file ownership upon upload with the ownership of the original download. Default value is: True | Boolean | No
-t, --type <arg> | The type of content being uploaded: files (default) or metadata | File type | No
Download
The following arguments are for downloading from the Pentaho Repository:
Argument | Description | Values | Required
-e, --export | Download command | n/a | Yes
-fp, --filepath <arg> | Location that the ZIP file is downloaded to | File path | Yes
-w, --withManifest <arg> | If true, includes Manifest.xml inside ZIP. If false, download excludes this file. | Boolean | No
Backup and restore
The following arguments are for backing up or restoring the Pentaho Repository:
Argument | Description | Values | Required
--backup | Backup command | n/a | Yes
--restore | Restore command | n/a | Yes
-u, --username <arg> | Pentaho Repository username | Alphanumeric | Yes
-p, --password <arg> | Pentaho Repository password | Alphanumeric | Yes
-fp, --filepath <arg> | Location that the ZIP file is downloaded to | File path | Yes
-o, --overwrite <arg> | Overwrites file(s) on upload. Default value is: True | Boolean | No
--logfile | Specifies the location for writing the log file. | File path | No
Common arguments
The following arguments apply to uploading, downloading, backing up and restoring the Pentaho Repository:
Argument | Description | Values | Required
-c, --charset <arg> | Charset to use for the repository. Characters from external systems are converted to this charset. | UTF-8 (default) | No
-h, --help | Prints this message. | n/a | No
-f, --path <arg> | Pentaho Repository path to which the uploaded files are added (for example: /public) | File path | Yes
-p, --password <arg> | Pentaho Repository password | Alphanumeric | Yes
-u, --username <arg> | Pentaho Repository username | Alphanumeric | Yes
-l, --logfile <arg> | Path to local file system with name of file to write | File path | No
-a_ds, --analysis-datasource <arg> | Analysis datasource type. | Alphanumeric | No
-a_xmla, --xmla-enabled <arg> | Analysis XMLA enabled flag. | Boolean | No
-cat, --catalog <arg> | Catalog description. | Alphanumeric | No
-ds, --datasource-type <arg> | Datasource type. | Alphanumeric | No
-m_id, --metadata-domain-id <arg> | Metadata domain ID. | Alphanumeric | No
-params, --params <arg> | Parameters to pass to REST service call. | Alphanumeric | No
-res, --resource-type <arg> | Import/Export resource type. | Alphanumeric | No
-rest, --rest | Use the REST (default) version (not local to the Pentaho Server). | Alphanumeric | No
-v, --service <arg> | This is the REST Service call, for example: ACL, children, properties | URL | No
Import and export PDI content
You can import and export PDI content from and to a repository by using PDI's built-in functions, explained in these subsections.
Among other purposes, these procedures are useful for backing up and restoring content in the solution repository. However, users, roles, permissions, and schedules are not included in import/export operations. If you want to back up these items, you should follow the procedure in Backup and restore Pentaho repositories instead.
Note: If you are on Pentaho version 8.0 or earlier, as a best practice to avoid errors when exporting and importing repository contents, select specific content and not the entire repository. For more information, see Importing and exporting PDI content with Pentaho 8.0 and earlier.
Import content into a repository
Follow the instructions below to import the repository. You must already be logged into the repository in the PDI client before you perform this task.
In the PDI client, go to Tools > Repository > Import Repository.
Locate the export (XML) file that contains the solution repository contents.
Click Open.
The Directory Selection dialog box appears.
Select the directory in which you want to import the repository.
Click OK.
Enter a comment, if applicable.
Wait for the import process to complete.
Click Close.
The full contents of the repository are now in the directory you specified.
Import content from the command line
The import script is a command-line utility that pulls content into an enterprise or database repository from an individual .kjb or .ktr file, or from complete repository export XML files.
You must also declare a rules file that defines certain parameters for the data integration content you are importing. We provide a sample file called import-rules.xml, which is included with the standard Data Integration client tool distribution. It contains all the potential rules, along with comments that describe what each rule does. You can either modify the import-rules.xml file or copy its contents to another file, and then declare the rules file as a command-line parameter.
The table below defines command-line options for the import script, which are declared using the syntax specific to the operating system type:
Linux
Options are declared using a dash (-) followed by the option name, then an equals sign (=) and the value, where applicable. For example:
-option=value
Windows
Options are declared using a forward slash (/) followed by the option name, then a colon (:) and the value, where applicable. For example:
/option:value
Note: For options requiring no value entry (replace, coe, and version), a dash or slash (depending on your OS) followed by the option is the equivalent of selecting ‘Yes’; otherwise, the option is ignored.
Option | Definition
rep | The name of the enterprise or database repository to import into.
user | The repository username you will use for authentication.
pass | The password for the username you specified with user.
dir | The directory in the repository that you want to copy the content to.
limitdir | Optional. A list of comma-separated source directories to include (excluding those directories not explicitly declared).
file | The path to the repository export file that you will import from.
rules | The path to the rules file, as explained above.
comment | The comment that will be set for the new revisions of the imported transformations and jobs.
replace | Replace existing transformations and jobs in the repository. (The default is: No)
coe | Continue on error, ignoring all validation errors. (The default is: No)
version | Show the version, revision, and build date of the PDI instance that the import script interfaces with. (The default is: No)
Linux
Import.sh -rep=Archive71 -user=admin -pass=password -coe -replace -dir=/home/admin -file=/Downloads/imagitasDemoEnclosure.ktr -rules=/Downloads/import-rules.xml -comment="New version upload from UAT"
Windows
Import.bat /rep:Archive71 /user:admin /pass:password /coe /replace /dir:\home\admin /file:C:\Downloads\imagitasDemoEnclosure.ktr /rules:C:\Downloads\import-rules.xml /comment:"New version upload from UAT"
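The two option syntaxes differ only in prefix and separator, which a script can apply mechanically. The following Python sketch formats a dictionary of import options for either operating system. The function name `format_import_options` is illustrative, and treating `True` as a value-less flag reflects the note above about replace, coe, and version.

```python
def format_import_options(options: dict, os_type: str) -> list:
    """Format import-script options as -name=value (Linux) or /name:value (Windows)."""
    if os_type == "linux":
        prefix, separator = "-", "="
    elif os_type == "windows":
        prefix, separator = "/", ":"
    else:
        raise ValueError(f"unsupported os_type: {os_type!r}")

    args = []
    for name, value in options.items():
        if value is True:
            # Flags such as replace, coe, and version take no value.
            args.append(f"{prefix}{name}")
        elif value is False or value is None:
            # An omitted flag is ignored, so emit nothing.
            continue
        else:
            args.append(f"{prefix}{name}{separator}{value}")
    return args
```

For example, `format_import_options({"rep": "Archive71", "coe": True}, "windows")` produces `/rep:Archive71` and `/coe`.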
Export content from the repository
Follow the instructions below to export the repository. You must already be logged into the repository through the PDI client to complete this task.
In the PDI client, go to Tools > Repository > Export Repository.
In the Save As dialog box, browse to the location where you want to save the export file.
Type a name for your export file in the File Name text box.
Click Save.
The export file is created in the location you specified. This XML file is a concatenation of all of the data integration content you selected. It is possible to break it up into individual KTR and KJB files by hand or through a transformation.
Export content from the command line
To export repository objects into XML format, using command-line tools instead of exporting repository configurations from within Spoon, use named parameters and command-line options when calling Kitchen or Pan from a command-line prompt.
The following is an example command-line entry to execute an export job using Kitchen:
Parameter | Description
rep_folder | Repository Folder
rep_name | Repository Name
rep_password | Repository Password
rep_user | Repository Username
target_filename | Target Filename
It is also possible to use obfuscated passwords with Encr, the command line tool for encrypting strings for storage/use by PDI. The following is an example command-line entry to execute a complete command-line call for the export in addition to checking for errors:
Set PDI version control and comment tracking options
Pentaho Data Integration (PDI) can track versions and comments for jobs, transformations, and connection information when you save them. You can turn version control and comment tracking on or off by modifying their related statements in the repository.spring.properties text file.
Note: By default, version control and comment tracking are disabled (set to false).
Editing the version control statement
Exit from the PDI client (also called Spoon).
Stop the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
Open the pentaho-server/pentaho-solutions/system/repository.spring.properties file in a text editor.
To enable version control: Edit the versioningEnabled statement and set it to: true
To disable version control: Edit the versioningEnabled statement and set it to: false
Note: If you disable version control, comment tracking is also disabled.
Save and close the file.
Start the Pentaho Server.
See the Install Pentaho Data Integration and Analytics document for instructions on starting and stopping the Pentaho Server.
Start the PDI client.
Verify that version control is set as you intended.
Verifying the version control option
Connect to the Pentaho Repository.
In the PDI client, click Tools > Explore.
In the Repository Explorer window, click on the Browse tab, then click on a file name.
Verify that version control is enabled or disabled:
Enabled
You can see the Access Control tab, and the Version History tab is visible.
Figure: Version History tab in the PDI client
Disabled
You can see the Access Control tab, but the Version History tab is hidden.
Figure: Access Control tab in the PDI client
Editing the comment tracking statement
Exit from the PDI client (also called Spoon).
Stop the Pentaho Server.
Open the pentaho-server/pentaho-solutions/system/repository.spring.properties file in a text editor.
To enable comment tracking: Edit the versionCommentsEnabled statement and set it to true.
To disable comment tracking while keeping version control:
Edit the versioningEnabled statement and set it to true.
Edit the versionCommentsEnabled statement and set it to false.
Save and close the file.
Start the Pentaho Server.
Start the PDI client.
Verify that Version Control and Comment Tracking are set as you intended.
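The two statements can also be updated with a short script while the server is stopped. This Python sketch assumes the file uses plain `key=value` lines, which this topic does not show explicitly. The function name `set_tracking` is illustrative. It mirrors the rule that disabling version control also disables comment tracking by writing both statements as false in that case.

```python
def set_tracking(properties_text: str, versioning: bool, comments: bool) -> str:
    """Return the properties text with both tracking statements set."""
    if not versioning:
        # Disabling version control also disables comment tracking.
        comments = False
    wanted = {
        "versioningEnabled": "true" if versioning else "false",
        "versionCommentsEnabled": "true" if comments else "false",
    }
    seen = set()
    lines = []
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()  # assumption: key=value format
        if key in wanted:
            lines.append(f"{key}={wanted[key]}")
            seen.add(key)
        else:
            lines.append(line)  # leave every other line untouched
    # Append any statement that was not already present in the file.
    for key, value in wanted.items():
        if key not in seen:
            lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

The function works on text rather than on the file itself, so the result can be reviewed before it is written back.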
Verifying the comment tracking option
Connect to the Pentaho Repository.
In the PDI client, click Tools > Explore.
In the Repository Explorer window, click on the Browse tab, then click on a file name.
Verify that comment tracking is enabled or disabled:
Enabled
The Version History tab appears with the Comments field. When you save a transformation, job, or connection information, you are prompted to enter a comment.
Figure: Version History tab showing the Comments column in the PDI client
Disabled
The Version History tab appears and the Comments field is hidden. When you save a transformation, job, or connection information, you are no longer prompted to enter a comment.
Figure: Version History tab with the Comments column hidden in the PDI client
Purge transformations, jobs, and shared objects from the Pentaho Repository
The Purge Utility allows you to permanently delete shared objects (servers, clusters, and databases) stored in the Pentaho Repository as well as content (transformations and jobs). You can also delete revision information for content and shared objects.
CAUTION:
Purging is permanent. Purged items cannot be restored.
To use the Purge Utility, complete these steps.
Make sure the Pentaho Repository is running.
Open a shell tool, command prompt window, or terminal window, and navigate to the pentaho/design-tools/data-integration directory.
At the prompt, enter the purge utility command.
The format for the command, a table that describes each parameter, and parameter examples follow.
Note: The command must contain the url, user, and password parameters, as well as one of these parameters: versionCount, purgeBeforeDate, purgeFiles, or purgeRevisions.
Windows
purge-utility.bat [-url] [-user] [-password] [-purgeSharedObjects] [-versionCount] [-purgeBeforeDate] [-purgeFiles] [-purgeRevisions] [-logFileName] [-logLevel]
Linux
purge-utility.sh [-url] [-user] [-password] [-purgeSharedObjects] [-versionCount] [-purgeBeforeDate] [-purgeFiles] [-purgeRevisions] [-logFileName] [-logLevel]
Each option is listed below with whether it is required, followed by its description.
-url
Required: Y
URL address for the Pentaho Repository. This is a required parameter. By default, the Pentaho Server is installed at this URL: http://localhost:8080/pentaho
-user
Required: Y
Username for an account that can access the Pentaho Server as an administrator. This is a required parameter.
-password
Required: Y
Password for the account used to access the Pentaho Server. This is a required parameter.
-purgeSharedObjects
Required: N
When set to TRUE, the parameter purges shared objects from the repository. This parameter must be used with the purgeFiles parameter. If you try to purge shared objects without including the purgeFiles parameter in the command line, an error occurs. If you set the purgeSharedObjects parameter to FALSE, it does not purge shared objects. If you include the purgeSharedObjects parameter in the command, but you do not set it to TRUE or FALSE, the Purge Utility assumes that it is set to TRUE.
-versionCount
Required: You must include only one of these: versionCount, purgeBeforeDate, purgeFiles, or purgeRevisions
Deletes the entire version history except for the last versionCount versions. Set this value to an integer.
-purgeBeforeDate
Required: One of the four parameters listed under -versionCount
Deletes all versions before purgeBeforeDate. The format for the date must be: mm/dd/yyyy
-purgeFiles
Required: One of the four parameters listed under -versionCount
When set to TRUE, transformations and jobs are permanently and physically removed. Shared objects (such as database connections) are NOT removed. If you also want to remove shared objects, include the purgeSharedObjects parameter as well. If you set the purgeFiles parameter to FALSE, it does not purge files. If you include the purgeFiles parameter in the command, but you do not set it to TRUE or FALSE, the Purge Utility assumes that it is set to TRUE.
-purgeRevisions
Required: One of the four parameters listed under -versionCount
When set to TRUE, all revisions are purged, but the current file remains unchanged. If you set the purgeRevisions parameter to FALSE, it does not purge revisions. If you include the purgeRevisions parameter in the command, but you do not set it to TRUE or FALSE, the Purge Utility assumes that it is set to TRUE.
-logFileName
Required: N
Allows you to specify the file name for the log file. If this parameter is not present, the log is written to a file that has this name format: purge-utility-log-YYYYMMdd-HHmmss.txt
YYYYMMdd-HHmmss indicates the date and time that the log file was created (e.g., purge-utility-log-20140313-154741.txt).
-logLevel
Required: N
Indicates the types and levels of detail the logs should contain. Values are: ALL, DEBUG, ERROR, FATAL, TRACE, INFO, OFF, and WARN. By default the log is set to INFO. Check the Log4J documentation for more details on the logging framework definitions: https://logging.apache.org/log4j/2.x/log4j-api/apidocs/org/apache/logging/log4j/Level.html.
In this example, the last five revisions of transformations and jobs are kept. All earlier revisions are deleted.
purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -versionCount=5
In the example that follows, all revisions before 01/11/2009 are deleted. Logging is set to the WARN level.
purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -purgeBeforeDate=01/11/2009 -logLevel=WARN
In this example, all transformations, jobs, and shared objects are deleted. You do not need to set the purgeFiles and purgeSharedObjects parameters to TRUE for this command to work. Logging is turned OFF.
purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -purgeFiles -purgeSharedObjects -logLevel=OFF
When finished, examine the logs to see if there were any issues or problems with the purge.
To see the results of the purge process, disconnect, then reconnect to the Pentaho Repository. In the Repository Explorer, in the Browse tab, verify that the items you specified in your purge utility command were purged.
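Because purging is permanent, it can help to validate a command before running it. The following Python sketch builds a purge command and enforces the rules stated above: url, user, and password are always present, exactly one of the four purge parameters is given, purgeSharedObjects is accepted only together with purgeFiles, and purgeBeforeDate parses as mm/dd/yyyy. The function name `build_purge_command` is illustrative, and the checks are a local convenience, not the Purge Utility's own validation.

```python
from datetime import datetime

ONE_OF = ("versionCount", "purgeBeforeDate", "purgeFiles", "purgeRevisions")


def build_purge_command(url, user, password, script="purge-utility.sh", **options):
    """Validate the options and return the command as a list (nothing is executed)."""
    chosen = [name for name in ONE_OF if name in options]
    if len(chosen) != 1:
        raise ValueError("include exactly one of: " + ", ".join(ONE_OF))
    if "purgeSharedObjects" in options and "purgeFiles" not in options:
        raise ValueError("purgeSharedObjects must be used with purgeFiles")
    if "purgeBeforeDate" in options:
        # Raises ValueError if the date is not in mm/dd/yyyy form.
        datetime.strptime(options["purgeBeforeDate"], "%m/%d/%Y")
    if "versionCount" in options:
        count = options["versionCount"]
        if isinstance(count, bool) or not isinstance(count, int):
            raise ValueError("versionCount must be an integer")

    command = [script, f"-url={url}", f"-user={user}", f"-password={password}"]
    for name, value in options.items():
        if value is True:
            command.append(f"-{name}")        # a bare flag is treated as TRUE
        elif value is False:
            command.append(f"-{name}=FALSE")
        else:
            command.append(f"-{name}={value}")
    return command
```

For example, `build_purge_command("http://localhost:8080/pentaho", "jdoe", "mypassword", versionCount=5)` reproduces the arguments of the first example above, with the Linux script name as the default.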
Backup and restore Pentaho repositories
A complete backup and restore of your Pentaho repositories can be done through either the command (cmd) window or with the File Resource service in the Pentaho REST APIs.
Note: If you are on Pentaho version 8.0 or earlier, as a best practice to avoid errors when exporting and importing repository contents, select specific content and not the entire repository. For more information, see Importing and exporting PDI content with Pentaho 8.0 and earlier.
The backup process exports all content from the Pentaho Repository and creates a ZIP file, which includes:
Users and roles
All files (dashboards, reports, etc.)
Schedules
Data connections
Mondrian schemas
Metadata entries
A manifest file
All of your content is pulled from this ZIP file when you restore the Pentaho Repository.
You must have appropriate administrator permissions on the server in order to perform a repository backup or restore.
Step 1: Back up the Pentaho Repository
Backing up your Pentaho Repository is done through the use of command line arguments. You can customize the provided examples for your server.
If an argument is required for successful backup and has not been provided, the missing requirement is displayed in the cmd window. Backup results are also displayed in the window.
Open a cmd window and point the directory to the install location of your running Pentaho Server.
Use the import-export script with your arguments for backing up the repository.
Press Enter.
For example,
Windows
import-export.bat --backup --url=http://localhost:8080/pentaho --username=admin --password=password --file-path=c:/home/Downloads/backup.zip --logfile=c:/temp/logfile.log --logLevel=DEBUG
Linux
./import-export.sh --backup --url=http://localhost:8080/pentaho --username=admin --password=password --file-path=/home/Downloads/backup.zip --logfile=/temp/logfile.log
Step 2: Restore the Pentaho Repository
Restoring your Pentaho Repository is also done through the use of command line arguments. The process for restoring both repositories is similar to the backup process, except for the differences shown in the provided examples. These examples can be customized for your particular server.
If an argument is required for successful restore and has not been provided, the missing requirement is displayed in the cmd window. Restore results are also displayed in the window.
Open a cmd window and point the directory to the install location of your running Pentaho Server.
Use the import-export script with your arguments for restoring the repository.
Press Enter.
For example,
Windows
import-export.bat --restore --url=http://localhost:8080/pentaho --username=admin --password=password --file-path=c:/home/Downloads/backup.zip --overwrite=true --logfile=c:/temp/logfile.log --logLevel=DEBUG
Linux
./import-export.sh --restore --url=http://localhost:8080/pentaho --username=admin --password=password --file-path=/home/Downloads/backup.zip --overwrite=true --logfile=/temp/logfile.log
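The backup and restore commands differ only in the action switch and the overwrite argument, so a single helper can produce both. This Python sketch reproduces the Linux examples above; the function name `build_repository_command` and its parameters are illustrative, and only switches that appear in those examples are emitted.

```python
def build_repository_command(action, url, username, password, file_path,
                             overwrite=None, logfile=None,
                             script="./import-export.sh"):
    """Return a backup or restore command as a list (nothing is executed)."""
    if action not in ("backup", "restore"):
        raise ValueError("action must be 'backup' or 'restore'")
    command = [
        script, f"--{action}",
        f"--url={url}",
        f"--username={username}",
        f"--password={password}",
        f"--file-path={file_path}",
    ]
    # The examples pass --overwrite only when restoring.
    if action == "restore" and overwrite is not None:
        command.append(f"--overwrite={'true' if overwrite else 'false'}")
    if logfile:
        command.append(f"--logfile={logfile}")
    return command
```

Building both commands from the same `file_path` value helps ensure that the restore step reads the ZIP file the backup step wrote.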