Jackrabbit hangs on unused data
The Jackrabbit repository (JCR) often retains a large amount of unused data if you migrate from the same repository multiple times. This unused data increases table sizes and slows down the repository.
You can clean up this unused data by enabling a system listener designed for this purpose. The cleanup can run only when no users are logged in to the JCR, and the repository remains locked while the process is running.
1. Stop the Pentaho Server.
2. Locate the pentaho-server/pentaho-solutions/scheduler-plugin directory and open the plugin.spring.xml file with any text editor.
3. Add the following listener bean:

   <bean id="repositoryCleanerSystemListener" class="org.pentaho.platform.plugin.services.repository.RepositoryCleanerSystemListener">
     <property name="gcEnabled" value="true"/>
     <property name="execute" value="now"/>
   </bean>

4. Add a reference to the listener bean (repositoryCleanerSystemListener) as shown in the following list:

   <util:list id="schedulerLifecycleListenerList" list-class="java.util.ArrayList" value-type="org.pentaho.platform.api.engine.IPluginLifecycleListener">
     <ref bean="embeddedQuartzSystemListener"/>
     <ref bean="embeddedVersionCheckSystemListener"/>
     <ref bean="emailGroupLifecycleListener"/>
     <ref bean="repositoryCleanerSystemListener"/>
   </util:list>

5. Save and close the plugin.spring.xml file and restart the Pentaho Server.
You can customize the settings for the repositoryCleanerSystemListener by editing the following properties. It is best practice to clean up the Jackrabbit repository on a regular schedule.
Property | Description
gcEnabled | A Boolean flag that turns the listener on (true) or off (false).
execute | Determines when the listener runs: now (runs once during server start-up), weekly (runs on the first day of each week, Sunday), or monthly (runs on the first day of each month).
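As an illustration of a regular cleanup schedule, the listener bean could be set to run weekly instead of once at start-up. The following is a minimal sketch that reuses the bean id and class from the procedure and changes only the execute value. It assumes the rest of plugin.spring.xml, including the reference in schedulerLifecycleListenerList, is left unchanged.

```xml
<!-- Sketch: the same listener bean as in the procedure, scheduled weekly -->
<bean id="repositoryCleanerSystemListener" class="org.pentaho.platform.plugin.services.repository.RepositoryCleanerSystemListener">
  <!-- true turns the listener on; false turns it off without removing the bean -->
  <property name="gcEnabled" value="true"/>
  <!-- one of: now, weekly, monthly -->
  <property name="execute" value="weekly"/>
</bean>
```

Because the repository remains locked while the cleanup runs, a weekly or monthly setting is probably best paired with a period when no users are expected to be logged in.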