Hadoop YARN / YARN-11455

All RMs in HA are stuck in standby when the ZK connection is disconnected


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.10.1, 3.3.3
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels: None

    Description

      All RMs in HA are stuck in standby when the ZK connection held by the active RM is disconnected.

      2023-02-22 13:08:19,832 INFO org.apache.hadoop.ha.ActiveStandbyElector (main-EventThread): Session disconnected. Entering neutral mode...
      2023-02-22 13:08:19,832 WARN org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService (main-EventThread): Lost contact with Zookeeper. Transitioning to standby in 10000 ms if connection is not reestablished.
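
      For readers unfamiliar with the grace period mentioned in the WARN line, the following is a minimal toy model of "transition to standby if the connection is not reestablished within N ms". It is not the Hadoop implementation: the class DisconnectGraceModel, its methods, and the explicit clock values are all hypothetical names made up for illustration, and it does not attempt to reproduce the stuck-in-standby behaviour reported here.

```java
/** Toy model of a disconnect grace period; hypothetical, not Hadoop code. */
class DisconnectGraceModel {
    enum HaState { ACTIVE, STANDBY }

    private final long graceMs;        // assumed grace period, e.g. the 10000 ms seen in the log
    private HaState state = HaState.ACTIVE;
    private long disconnectedAt = -1;  // -1 means "currently connected"

    DisconnectGraceModel(long graceMs) {
        this.graceMs = graceMs;
    }

    /** Record the time of the first Disconnected event; repeated events keep the original time. */
    void onDisconnected(long nowMs) {
        if (disconnectedAt < 0) {
            disconnectedAt = nowMs;
        }
    }

    /** A reconnect within the grace period cancels the pending transition. */
    void onReconnected() {
        disconnectedAt = -1;
    }

    /** Advance the model's clock; step down once the grace period has fully elapsed. */
    void tick(long nowMs) {
        boolean disconnected = disconnectedAt >= 0;
        if (state == HaState.ACTIVE && disconnected && nowMs - disconnectedAt >= graceMs) {
            state = HaState.STANDBY;
        }
    }

    HaState state() {
        return state;
    }

    public static void main(String[] args) {
        DisconnectGraceModel rm = new DisconnectGraceModel(10_000);
        rm.onDisconnected(0);
        rm.tick(9_999);
        System.out.println("just before the grace period ends: " + rm.state());
        rm.tick(10_000);
        System.out.println("once the grace period has elapsed: " + rm.state());
    }
}
```

      In this model a reconnect before the deadline leaves the node active. What the report describes goes beyond it: after the disconnect, no RM becomes active again.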

       

      Repro:

      Send a Disconnected event to the active RM using the code below.

      zkConnectionState = ConnectionState.DISCONNECTED;
      enterNeutralMode();
      

       


          People

            Assignee: Prabhu Joseph (prabhujoseph)
            Reporter: Prabhu Joseph (prabhujoseph)
            Votes: 1
            Watchers: 3
