Install and configure a Remote Worker

The Remote Worker in Data Catalog provides a secure, scalable way to manage metadata. It extracts metadata and runs tasks across distributed environments while complying with your network security requirements. For more information, see the Remote Worker section in Use Pentaho Data Catalog.

To install and configure a Remote Worker, first confirm the following prerequisites, then complete the numbered steps:

  • Ensure that Docker and Docker Compose are installed on the machine where the Remote Worker will be deployed.

  • Ensure that you have the latest Data Catalog Docker deployment bundle (pdc-10.2.5-images.tgz) and Remote Worker bundle (pdc-remote-10.2.5-compose.tgz). If you do not have them, contact Pentaho support.
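The prerequisites above can be checked from a shell before you start. The following is a minimal sketch, not part of the Pentaho tooling: the helper names missing_commands and missing_files are hypothetical, and it assumes both bundles have been downloaded into a single directory.

```shell
# Hypothetical helpers (not part of the Pentaho tooling).

# Print each command named in the arguments that is not found on PATH.
missing_commands() (
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
)

# Print each file (arguments after the first) that does not exist in directory $1.
missing_files() (
  dir="$1"
  shift
  for f in "$@"; do
    [ -f "$dir/$f" ] || printf '%s\n' "$f"
  done
)
```

For example, missing_commands docker prints nothing when Docker is on the PATH, and missing_files . pdc-10.2.5-images.tgz pdc-remote-10.2.5-compose.tgz lists any bundle that is absent from the current directory. Docker Compose can be installed either as a plugin or as a standalone docker-compose binary, so running docker compose version is a more reliable check for the plugin than a PATH lookup.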

  1. Load the latest Data Catalog Docker deployment bundle pdc-10.2.5-images.tgz into Docker.

  2. Extract the Data Catalog Remote Worker bundle and navigate into the pdc-docker-deployment directory:

    sudo tar -xvf pdc-remote-10.2.5-compose.tgz -C /opt/
    cd /opt/pentaho/pdc-docker-deployment/
    
  3. Create the conf directory:

    sudo mkdir conf

  4. Create a .env file, add the required environment variables as follows, and save the file:

    1. Copy the variables PDC_WS_REMOTE_JOB_SERVER_ID and PDC_DATA_ENCRYPTION_KEY along with their values from the base server installation.

    2. Add PDC_WS_REMOTE_OPS_URL, PDC_WS_REMOTE_GLOSSARY_BASE_URL, and PDC_MONGODB_OPS_DATABASE_URL, and set them to the ops, glossary service, and MongoDB addresses of the base server.

    GLOBAL_SERVER_HOST_NAME=<Base Server FQDN or IP Address>
    PDC_WS_REMOTE_JOB_SERVER_ID="eb710d72-9613-a978-42c5-a101343bf6ca"
    PDC_DATA_ENCRYPTION_KEY="2eindcVFPic6uA1o0wRWnXsBKNKiMMhbc2P9qTtvUTE="
    COMPOSE_PROFILES=ws-remote
    PDC_WS_REMOTE_OPS_URL="https://<Base Server FQDN or IP Address>/internal/ops/"
    PDC_WS_REMOTE_GLOSSARY_BASE_URL="https://<Base Server FQDN or IP Address>/glossary-service/api/v1/"
    PDC_MONGODB_OPS_DATABASE_URL="mongodb://root:broot@<Base Server FQDN or IP Address>:27017/ops?directConnection=true&authSource=admin&replicaSet=rs0"
    PDC_WS_REMOTE_DQ_CLIENT_ID=
    PDC_WS_REMOTE_DQ_API_URL=
    PDC_WS_REMOTE_DQ_CLIENT_SECRET=
    LOG_FLUENTBIT_ELASTICSEARCH_HOST=${GLOBAL_SERVER_HOST_NAME}

  5. Copy the extra-certs directory, including the certificates it contains, into the conf directory on the Remote Worker machine:

    cp -r extra-certs/ conf/

  6. Start the Remote Worker:

    sudo ./pdc.sh up

  7. Verify the Remote Worker deployment in Data Catalog:

    1. Log in to Data Catalog and click Management in the left navigation menu.

    2. In the Resources card, click Data Centers.

      The Remote Worker is listed with Affinity set to Remote.

      Figure: Remote Worker in Data Catalog
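Before starting the Remote Worker, it can help to sanity-check the .env file from step 4. The sketch below is an illustration only: missing_env_vars and unreplaced_placeholders are hypothetical helper names, and which variables count as required is an assumption based on the sample file, where the PDC_WS_REMOTE_DQ_* entries are left empty.

```shell
# Hypothetical helpers (not part of the Pentaho tooling).

# Print the names of variables that are missing or empty in an env file.
# Usage: missing_env_vars FILE NAME...
missing_env_vars() (
  file="$1"
  shift
  for name in "$@"; do
    # A variable counts as set when NAME= is followed by at least one
    # character other than whitespace or a double quote.
    grep -Eq "^${name}=\"?[^\"[:space:]]" "$file" || printf '%s\n' "$name"
  done
)

# Print (with line numbers) any lines that still contain an <...> placeholder,
# such as <Base Server FQDN or IP Address>.
unreplaced_placeholders() {
  grep -n '<[^>]*>' "$1"
}
```

For example, missing_env_vars .env GLOBAL_SERVER_HOST_NAME PDC_WS_REMOTE_JOB_SERVER_ID PDC_DATA_ENCRYPTION_KEY PDC_WS_REMOTE_OPS_URL PDC_WS_REMOTE_GLOSSARY_BASE_URL PDC_MONGODB_OPS_DATABASE_URL prints nothing when all six variables have values.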

You have successfully installed the Remote Worker and registered it with Data Catalog.
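The command-line steps of the procedure can be gathered into two shell functions, with the manual creation of the .env file (step 4) in between. This is a sketch rather than a supported installer: run, prepare_remote_worker, and start_remote_worker are hypothetical names, and using docker load for step 1 assumes the images bundle is an archive that docker load accepts. Setting DRY_RUN prints each command instead of executing it, so the sequence can be reviewed first.

```shell
# Hypothetical helpers (not part of the Pentaho tooling).

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Steps 1 to 3. $1: images bundle, $2: Remote Worker bundle.
prepare_remote_worker() {
  run docker load -i "$1" || return 1            # step 1 (assumed load command)
  run sudo tar -xvf "$2" -C /opt/ || return 1    # step 2
  run cd /opt/pentaho/pdc-docker-deployment/ || return 1
  run sudo mkdir conf                            # step 3
}

# Steps 5 and 6. Run from the deployment directory after .env exists (step 4).
# $1: the extra-certs directory.
start_remote_worker() {
  run test -f .env || return 1                   # stop if step 4 was skipped
  run cp -r "$1" conf/ || return 1               # step 5
  run sudo ./pdc.sh up                           # step 6
}
```

With DRY_RUN=1 set, prepare_remote_worker followed by the paths of the two bundles prints the four commands for steps 1 to 3 without running them. After creating the .env file, start_remote_worker extra-certs/ performs steps 5 and 6.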
