Hadoop YARN / YARN-11590

RM process stuck after calling confStore.format() when ZK SSL/TLS is enabled, as netty thread waits indefinitely

Details

    • Reviewed

    Description

      YARN-11468 enabled Zookeeper SSL/TLS support for YARN.
      Curator uses ClientCnxnSocketNetty for secured connections, and its thread needs to be closed after calling confStore.format(). Otherwise the netty thread waits indefinitely, which leaves the RM process stuck after it has deleted the confstore when started with the "-format-conf-store" argument.

      The unclosed thread, which keeps the RM process running:

      2023-10-10 12:13:01,000 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: The Thread[main-SendThread(ferdelyi-1.ferdelyi.root.hwx.site:2182),5,main]TIMED_WAITING is stands at [sun.misc.Unsafe.park(Native Method), java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215), java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078), java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522), java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684), org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:275), org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)]
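
      The close-after-format pattern described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: SketchConfStore and formatAndClose() are stand-ins invented for illustration, not the YARN configuration store or the actual patch for this issue. The background thread is modelled as a plain non-daemon Java thread that polls a queue with a timeout, which is a simplification and not a claim about how ZooKeeper or Netty configure their threads.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

public class FormatConfStoreSketch {

    /** Hypothetical stand-in for a ZK-backed configuration store; not the YARN class. */
    static class SketchConfStore implements AutoCloseable {
        private final LinkedBlockingDeque<String> outgoing = new LinkedBlockingDeque<>();
        private volatile boolean closed = false;

        // Assumption: a non-daemon thread that polls a queue with a timeout,
        // loosely mirroring the TIMED_WAITING stack shown in the log above.
        private final Thread sendThread = new Thread(() -> {
            while (!closed) {
                try {
                    outgoing.poll(100, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    return; // interrupted by close(): stop polling
                }
            }
        }, "sketch-SendThread");

        SketchConfStore() {
            sendThread.start();
        }

        /** Placeholder for the step that deletes the stored configuration. */
        void format() {
            // no-op in this sketch
        }

        /** Stops the background thread and waits for it to exit. */
        @Override
        public void close() throws InterruptedException {
            closed = true;
            sendThread.interrupt();
            sendThread.join();
        }

        boolean sendThreadAlive() {
            return sendThread.isAlive();
        }
    }

    /** Formats the store and always closes it; returns whether the thread is still alive. */
    static boolean formatAndClose() throws Exception {
        SketchConfStore store = new SketchConfStore();
        try {
            store.format();
        } finally {
            store.close(); // without this call the polling thread keeps the JVM running
        }
        return store.sendThreadAlive();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("send thread alive after close: " + formatAndClose());
    }
}
```

      Putting close() in a finally block means the thread is stopped even if format() throws, so in this sketch the process can exit either way.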
      

            People

              bender Ferenc Erdelyi