Use Carte Clusters
Carte is a lightweight web server for running PDI transformations and jobs remotely.
It receives the transformation or job (as XML) plus the run configuration. It also exposes endpoints to monitor, start, and stop executions.
Carte clusters
Use a Carte cluster to distribute transformation processing across multiple Carte servers.
A cluster includes:
One master node that tracks execution.
Two or more slave nodes that do the work.
You can also run a single Carte instance as a standalone remote execution engine. Define one or more Carte servers in the PDI client (Spoon), then send jobs and transformations to them.
You can cluster Pentaho Server for failover. If you use Pentaho Server as the cluster master (dynamic cluster), enable the proxy trusting filter. See Schedule jobs to run on a remote Carte server.
Cluster types
Static cluster
Static clusters have a fixed schema.
You define the master and slave nodes at design time.
Static clusters fit smaller, stable environments.
Dynamic cluster
Dynamic clusters discover slave nodes at run time.
Slave nodes are registered with the master. PDI monitors slaves every 30 seconds to see if they are available.
Dynamic clusters fit cloud-like environments where nodes come and go.
Set up servers
Prerequisites
Copy required JDBC drivers and PDI plugins from your dev system to each Carte instance.
If you will run content from a Pentaho Repository, copy repositories.xml from your workstation’s .kettle directory to the same location on each Carte server.
Set up a static cluster (start slave servers)
Start each slave server with the host and port you want to expose:
Verify each server is reachable from your PDI client.
(Optional) Create an init/startup script to start Carte on boot.
When Carte runs embedded in Pentaho Server, configuration is controlled by slave-server-config.xml under .../pentaho-solutions/system/kettle/. Stop Pentaho Server before editing that file.
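The start and verify steps above can be sketched as follows. This sketch assumes Carte is started as carte.sh followed by a hostname and a port, and that the server exposes its status page under /kettle/status/. The PDI_HOME path, the example addresses, and the helper function names are placeholders for illustration, not values from this guide.

```shell
#!/bin/sh
# Assumed install location of the data-integration directory; adjust as needed.
PDI_HOME="${PDI_HOME:-/opt/pentaho/data-integration}"

# Print the command that starts one Carte slave on the given host and port.
carte_start_command() {
  printf '%s/carte.sh %s %s\n' "$PDI_HOME" "$1" "$2"
}

# Print the status page URL used to check that a server is reachable.
carte_status_url() {
  printf 'http://%s:%s/kettle/status/\n' "$1" "$2"
}

# Example slaves (hypothetical addresses). Run each command on its own host.
carte_start_command 192.168.1.10 8081
carte_start_command 192.168.1.11 8081

# From the PDI client machine, a reachability check could look like this
# (cluster/cluster is assumed to be the unchanged default login):
#   curl -u cluster:cluster "$(carte_status_url 192.168.1.10 8081)"
carte_status_url 192.168.1.10 8081
```

The helpers only print the commands and URLs, so you can review them before running anything on a real host.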
Set up a dynamic cluster
Dynamic clusters use two configuration files:
carte-master-config.xml for the master.
carte-slave-config.xml for each slave.
You can rename the files. Keep the required XML structure and values.
Configure a Carte master server
Copy required JDBC drivers and plugins to the master host.
Create carte-master-config.xml using this template:
The master <name> must be unique in the cluster.
Start Carte using the master config file:
Verify the master is running.
(Optional) Create an init/startup script for boot-time startup.
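As a rough illustration of the master setup steps, the sketch below writes a minimal master configuration file. The element layout follows the commonly documented Carte configuration structure, but treat it as an assumption and compare it with the template shipped with your PDI version. The host name, port, and credentials are placeholders, and write_master_config is a hypothetical helper name.

```shell
#!/bin/sh
# Write a minimal master configuration to the path given as $1.
# Host name, port, and credentials below are placeholders.
write_master_config() {
  cat > "$1" <<'EOF'
<slave_config>
  <!-- On a master, <slaveserver> describes this Carte instance itself. -->
  <slaveserver>
    <name>Master</name>
    <hostname>localhost</hostname>
    <port>8080</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>Y</master>
  </slaveserver>
</slave_config>
EOF
}

write_master_config carte-master-config.xml

# Then start Carte with the file, from the data-integration directory:
#   ./carte.sh carte-master-config.xml
```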
Configure Carte slave servers
Ensure the master is running.
Copy required JDBC drivers and plugins to each slave host.
Create carte-slave-config.xml using this template:
Each slave <name> must be unique in the cluster.
(Optional) To use the master’s Kettle properties on a slave, add these tags inside the slave’s <slaveserver>:
Start Carte using the slave config file:
If you use Pentaho Repository content, copy repositories.xml to each slave’s .kettle directory.
Restart the master and slave servers. Restart Pentaho Server if it participates.
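The slave setup steps could be sketched like this. The structure (a masters block, a report_to_masters flag, and the slave's own slaveserver entry) and the two optional property tags are assumptions based on the commonly documented Carte configuration layout, so verify the tag names against your PDI version. All host names, ports, and credentials are placeholders, and write_slave_config is a hypothetical helper name.

```shell
#!/bin/sh
# Write a minimal slave configuration to the path given as $1.
write_slave_config() {
  cat > "$1" <<'EOF'
<slave_config>
  <!-- The master this slave registers with; must match the master's config. -->
  <masters>
    <slaveserver>
      <name>Master</name>
      <hostname>localhost</hostname>
      <port>8080</port>
      <username>cluster</username>
      <password>cluster</password>
      <master>Y</master>
    </slaveserver>
  </masters>
  <report_to_masters>Y</report_to_masters>
  <!-- This slave's own identity; <name> must be unique in the cluster. -->
  <slaveserver>
    <name>SlaveOne</name>
    <hostname>localhost</hostname>
    <port>8081</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>N</master>
    <!-- Optional: reuse the master's Kettle properties (assumed tag names). -->
    <get_properties_from_master>Master</get_properties_from_master>
    <override_existing_properties>Y</override_existing_properties>
  </slaveserver>
</slave_config>
EOF
}

write_slave_config carte-slave-config.xml

# Then start Carte with the file, from the data-integration directory:
#   ./carte.sh carte-slave-config.xml
```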
Carte and PDI track object age for transformations and jobs. Objects are purged only when servers are idle. Purge verification runs every 20 seconds.
Configure schedule and remote execution log cleanup
These settings live in slave-server-config.xml.
Stop Pentaho Server before editing this file.
max_log_lines: Max log lines per execution. Use 0 for no limit.
max_log_timeout_minutes: Remove log lines older than this value. Use 0 for no timeout.
object_timeout_minutes: Remove execution entries older than this value. Use 0 for no timeout.
Example:
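A fragment using the three settings might look like the sketch below. It assumes the tags sit directly under the slave_config root element; the numeric values are illustrative only, and log_cleanup_fragment is a hypothetical helper that prints the fragment so you can copy it into the file.

```shell
#!/bin/sh
# Print an illustrative cleanup fragment for slave-server-config.xml.
# 10000 lines and 1440 minutes (one day) are example values, not defaults.
log_cleanup_fragment() {
  cat <<'EOF'
<slave_config>
  <max_log_lines>10000</max_log_lines>
  <max_log_timeout_minutes>1440</max_log_timeout_minutes>
  <object_timeout_minutes>1440</object_timeout_minutes>
</slave_config>
EOF
}

log_cleanup_fragment
```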
Security and advanced server settings
Configure Carte servers for SSL
Carte SSL uses the JKS keystore format.
Keep the keystore in a restricted-access directory. Carte runs on Jetty.
For Jetty SSL details, see: https://wiki.eclipse.org/Jetty/Howto/Configure_SSL.
Stop Carte.
Open carte-master-config.xml.
Add these values inside the master server <slaveserver>:
keyStore (required): Path to the keystore file.
keyStorePassword (required): Keystore password.
keyPassword (optional): Private key password. Omit if it matches keyStorePassword.
Example:
Use the encr tool in the data-integration directory to obfuscate passwords: encr.bat -carte <password> (Windows) or encr.sh -carte <password> (Linux).
Add the same <sslConfig> block to each carte-slave-config.xml.
Start Carte.
Access Carte over HTTPS:
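The SSL steps above could be sketched as follows. The sslConfig block uses the three value names listed in the steps; the keystore path and the password placeholders are hypothetical, and the status URL path /kettle/status/ is an assumption about where Carte serves its status page. The helper function names are illustrative only.

```shell
#!/bin/sh
# Print an illustrative <sslConfig> block to place inside <slaveserver>.
# Replace the placeholders with the output of the encr tool.
ssl_config_fragment() {
  cat <<'EOF'
<sslConfig>
  <keyStore>/etc/carte/keystore.jks</keyStore>
  <keyStorePassword>REPLACE_WITH_ENCR_OUTPUT</keyStorePassword>
  <keyPassword>REPLACE_WITH_ENCR_OUTPUT</keyPassword>
</sslConfig>
EOF
}

# Print the HTTPS status URL for a given host and port.
carte_https_url() {
  printf 'https://%s:%s/kettle/status/\n' "$1" "$2"
}

ssl_config_fragment
carte_https_url localhost 8081
```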
Configure Carte servers for JAAS
You can use JAAS for user authentication.
Create a JAAS config file (example below) and save it as carte-ldap.jaas.conf on the Carte host:
Set debug="false" in production environments.
Add these Java options to Spoon.bat (Windows) or spoon.sh (Linux), updating the path:
Start Carte. Verify the server does not prompt for BASIC authentication.
Change Jetty server parameters
Carte uses an embedded Jetty server.
Only change these settings if you need to tune connection handling.
acceptors: Threads dedicated to accepting connections. Keep it at or below CPU count.
acceptQueueSize: Backlog size before the OS starts rejecting connections.
lowResourcesMaxIdleTime: Close idle connections faster under high load.
Jetty docs:
Set Jetty parameters in a Carte config file
Add this block inside <slave_config> in carte-slave-config.xml:
Adjust values, then save the file.
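A block of this kind might look like the sketch below. The jetty_options wrapper element name is an assumption based on commonly documented Carte configuration, so confirm it against your PDI version; the three inner tags match the parameter names listed above, and the values are illustrative. jetty_options_fragment is a hypothetical helper that prints the block.

```shell
#!/bin/sh
# Print an illustrative Jetty tuning block for carte-slave-config.xml.
# The <jetty_options> wrapper name is assumed; values are examples only.
jetty_options_fragment() {
  cat <<'EOF'
<jetty_options>
  <acceptors>2</acceptors>
  <acceptQueueSize>5000</acceptQueueSize>
  <lowResourcesMaxIdleTime>5000</lowResourcesMaxIdleTime>
</jetty_options>
EOF
}

jetty_options_fragment
```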
Set Jetty parameters in kettle.properties
Set these variables to numeric values:
KETTLE_CARTE_JETTY_ACCEPTORS
KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE
KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME
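For example, the three variables could be set with entries like those printed by the sketch below. The numeric values are illustrative, not recommendations, and jetty_properties is a hypothetical helper name.

```shell
#!/bin/sh
# Print illustrative kettle.properties entries for the Jetty variables.
jetty_properties() {
  cat <<'EOF'
KETTLE_CARTE_JETTY_ACCEPTORS=2
KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE=5000
KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME=5000
EOF
}

# To append them to your own file, you could run:
#   jetty_properties >> "$HOME/.kettle/kettle.properties"
jetty_properties
```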
Configure the PDI client
Initialize slave servers
Open a transformation.
In Explorer View, select the Slave tab.
Select New.
Enter the slave server connection details:
Server name
Hostname or IP address
Port (leave blank for port 80)
Web App Name (required only for Pentaho Server)
User name and password
Is the master
For clustered executions, define one master and the rest as slaves.
Select OK.
Create a cluster schema
In Explorer View, right-click Kettle cluster schemas, then select New.
Configure:
Schema name
Port: Starting port for slave step numbering.
Sockets buffer size
Sockets flush interval rows
Sockets data compressed?
Dynamic cluster: Enable if a master Carte server performs failover.
Slave Servers: Add one master and any number of slaves.
Run transformations in a cluster
Open the Run Options window (toolbar Run context menu or F8).
Select a run configuration that runs the transformation in clustered mode.
To run a clustered transformation from a job, open the Transformation job entry, then set Run this transformation in a clustered mode? on the Advanced tab.
To assign a cluster to a step, right-click the step, select Clusters, then pick a cluster schema.
When running clustered transformations, enable Show transformations to see the generated transformations that run on the cluster.
Schedule and run remotely
Schedule jobs to run on a remote Carte server
These changes are required to schedule a job to run on a remote Carte server.
They are also required if Pentaho Server acts as the load balancer in a dynamic Carte cluster.
Stop Pentaho Server and the remote Carte server.
Copy repositories.xml from your workstation’s .kettle directory to the same location on the Carte host.
Open .../tomcat/webapps/pentaho/WEB-INF/web.xml.
In the Proxy Trusting Filter section, add the Carte server IP to TrustedIpAddrs.
Uncomment the proxy trusting filter mappings between the <!-- begin trust --> and <!-- end trust --> markers.
Save web.xml.
Add -Dpentaho.repository.client.attemptTrust=true to the Carte startup script:
Windows (Carte.bat): add to the OPT line.
Linux (Carte.sh): add to the OPT variable before export OPT.
Start the Carte server and Pentaho Server.
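On Linux, the startup script change described above could look like this sketch. It assumes OPT is a plain shell variable that the script builds up and then exports; the surrounding contents of Carte.sh differ between versions and are not reproduced here.

```shell
#!/bin/sh
# Sketch of the Carte.sh change: extend OPT, then export it.
# ${OPT:-} keeps any options that were already set earlier in the script.
OPT="${OPT:-} -Dpentaho.repository.client.attemptTrust=true"
export OPT

# Show the resulting value for review.
echo "$OPT"
```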
Run transformations and jobs from a repository on the Carte server
Copy repositories.xml from the user’s .kettle directory to the Carte host’s $HOME/.kettle directory.
Carte also looks for repositories.xml in the directory where you started Carte.
Stop Carte
You can stop Carte from the command line or from a URL.
Stop from the CLI
Arguments:
Example:
Options:
-h, --help: Help text.
-s, --stop: Stop the running Carte server.
-u, --username <arg>: Admin user name.
-p, --password <arg>: Admin password.
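Putting the options together, a stop command might be assembled as in the sketch below. It assumes the host and port are passed first, followed by the options listed above; the address and the cluster/cluster credentials are placeholders, and carte_stop_command is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# Print a Carte stop command for the given host, port, user, and password.
# Argument order (host and port first, then options) is an assumption.
carte_stop_command() {
  printf './carte.sh %s %s -s -u %s -p %s\n' "$1" "$2" "$3" "$4"
}

# Example with placeholder values; run the printed command from the
# data-integration directory on a machine that can reach the server.
carte_stop_command 127.0.0.1 8080 cluster cluster
```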
Stop from a URL
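As a sketch, the stop URL can be built from the server's host and port. The /kettle/stopCarte path is an assumption based on Carte's servlet naming, so confirm it on your version before relying on it. The host, port, and credentials are placeholders, and carte_stop_url is a hypothetical helper name.

```shell
#!/bin/sh
# Print the assumed stop URL for a Carte server.
carte_stop_url() {
  printf 'http://%s:%s/kettle/stopCarte\n' "$1" "$2"
}

# Opening the URL in a browser prompts for the admin login. From a shell,
# a request could look like this (placeholder credentials):
#   curl -u cluster:cluster "$(carte_stop_url localhost 8080)"
carte_stop_url localhost 8080
```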

