Pig job not executing after Kerberos authentication fails
After Kerberos authentication fails, your Pig jobs will not execute until you restart PDI. Even though PDI may continue to generate new Kerberos tickets and other Hadoop components may keep working, Pig continues to fail until the restart.
For authentication with Pig, Pentaho uses the UserGroupInformation wrapper around a JAAS Subject with username and password, which is used for impersonation. This UserGroupInformation instance is stored by the KMSClientProvider constructor. When the Kerberos ticket expires, a new UserGroupInformation is created, but the instance held by the KMSClientProvider is never updated. The Pig job then fails because Pig cannot obtain the delegation tokens it needs to authenticate the job at execution time.
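The failure mode is easier to see in code. The following is a minimal Java sketch (a hypothetical class, not Pentaho's or Hadoop's actual implementation) of a client that captures a UserGroupInformation reference in its constructor, mirroring how KMSClientProvider holds on to the UGI it was created with. Once the ticket behind that reference expires, every doAs call fails, even though a fresh login exists elsewhere in the JVM:

    import org.apache.hadoop.security.UserGroupInformation;

    import java.io.IOException;
    import java.security.PrivilegedExceptionAction;

    // Hypothetical illustration, not Pentaho code.
    public class StaleUgiClient {
        // Captured once at construction; never refreshed when PDI re-authenticates.
        private final UserGroupInformation cachedUgi;

        public StaleUgiClient() throws IOException {
            this.cachedUgi = UserGroupInformation.getCurrentUser();
        }

        public void fetchDelegationTokens() throws IOException, InterruptedException {
            // After the original Kerberos ticket expires, this doAs call runs
            // with stale credentials and fails, even though a new
            // UserGroupInformation was created by the fresh login.
            cachedUgi.doAs((PrivilegedExceptionAction<Void>) () -> {
                // ... request delegation tokens from the KMS here ...
                return null;
            });
        }
    }

Lowering the key provider cache expiry, as described below, causes the cached provider (and the stale UGI it holds) to be evicted and rebuilt before the ticket expires.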
To resolve this issue, set the key.provider.cache.expiry time to a value equal to or less than the lifetime of the Kerberos ticket. By default, key.provider.cache.expiry is set to 10 days.
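For example, assuming a Kerberos ticket lifetime of 10 hours (an illustrative value; check your own krb5.conf for the actual setting), the property must be set to at most 10 × 60 × 60 × 1000 = 36,000,000 milliseconds.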
Note: This solution assumes you are using Hortonworks 3.0.
1. Navigate to the hdfs-site.xml file location.

   In the PDI client, navigate to:
   data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\hdp25

   For the Pentaho Server, navigate to:
   pentaho-server\pentaho-solutions\system\kettle\plugins\pentaho-big-data-plugin\hadoop-configurations\hdp25

2. Open the hdfs-site.xml file in a text editor.

3. Adjust the key.provider.cache.expiry value (in milliseconds) so that it is less than the duration time of the Kerberos ticket.

   Note: You can view the Kerberos ticket duration time in the krb5.conf file.

   <property>
     <name>dfs.client.key.provider.cache.expiry</name>
     <value>410000</value>
   </property>
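For reference, the ticket lifetime is typically defined by the ticket_lifetime entry in the [libdefaults] section of krb5.conf. The snippet below is only an illustrative sketch; the realm and durations are assumed values, not your actual configuration:

    [libdefaults]
        # Illustrative values only; your krb5.conf will differ
        default_realm = EXAMPLE.COM
        # A 10-hour ticket lifetime equals 36,000,000 milliseconds
        ticket_lifetime = 10h
        renew_lifetime = 7d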