  1. Oozie
  2. OOZIE-3624

Oozie scheduled workflows fail when yarn/hdfs cluster changes

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.2.0
    • Fix Version/s: None
    • Component/s: coordinator, workflow
    • Labels: None

    Description

      When the YARN cluster used by an Oozie scheduled workflow is replaced with a new cluster, future runs of the scheduled workflow break because they depend on the workflow and job.properties files that were deployed on HDFS.
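
      For illustration, a job.properties for a scheduled coordinator usually pins both cluster endpoints. The sketch below follows the layout of the stock Oozie examples; the host names and application path are hypothetical, not taken from my deployment. Once the cluster behind these addresses is replaced, every future materialized run still resolves to the old endpoints.

      ```properties
      # Hypothetical values, for illustration only.
      # HDFS endpoint of the cluster the application was deployed to.
      nameNode=hdfs://old-cluster-nn.example.com:8020
      # YARN ResourceManager endpoint of the same cluster.
      resourceManager=old-cluster-rm.example.com:8032
      queueName=default
      # The coordinator definition is looked up on that HDFS at each run.
      oozie.coord.application.path=${nameNode}/user/${user.name}/apps/my-coordinator
      ```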

       

      The YARN jobtracker also stops working, failing with:

      Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1622844783178_0004_000002 not found in AMRMTokenSecretManager.

      It seems that some tokens are stored in YARN, so when the YARN cluster is terminated and replaced with a new one, the Oozie launcher hits this error. The same invalid token error also occurs when I configure Oozie to use a remote YARN cluster.

      Recreating the YARN cluster is a common case in the cloud. Is there a way for Oozie to be resilient to the underlying YARN cluster changing?

      Also, is it supported to deploy the workflow, coordinator, and job.properties files on S3 instead of HDFS?

          People

            Assignee: Unassigned
            Reporter: Rentao Wu (rentaow)