HBASE-5099

ZK event thread waiting for root region assignment may block server shutdown handler for the region server the root region was on

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.92.0, 0.94.0
    • Fix Version/s: 0.92.0, 0.94.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A region server (RS) died. The ServerShutdownHandler kicked in and started log splitting. SplitLogManager
      installed the tasks asynchronously, then started waiting for them to complete.

      The task znodes had not actually been created; the requests were only queued.
      At this point the ZooKeeper connection expired, and HMaster tried to recover the expired ZK session.
      During the recovery a new ZooKeeper connection was created, and this master became the
      master again. It then tried to assign root and meta.

      Because the dead RS had been hosting the root region, the master has to wait for log splitting to complete
      before the root region can be reassigned. This wait holds the ZooKeeper event thread. There is only one
      event thread, and it is blocked waiting for the root region to be assigned, so the asynchronous create of
      the split tasks is never retried.
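
The hang described above can be reproduced in miniature without HBase or ZooKeeper. The following is only a sketch under stated assumptions: a single-thread executor stands in for the one ZK event thread, a latch stands in for "split task znodes created", and the class and method names (`EventThreadDeadlockSketch`, `simulate`) are hypothetical. A timeout is used so the demo terminates; in the reported scenario the wait is unbounded. Moving the wait to a worker thread is shown as one general way to free the event thread, not necessarily what the attached patches do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class EventThreadDeadlockSketch {

    /**
     * Returns true if the queued "split tasks created" callback ran before
     * the wait gave up. offloadWait=false blocks on the event thread itself;
     * offloadWait=true hands the blocking wait to a separate worker thread.
     */
    static boolean simulate(boolean offloadWait, long timeoutMs) {
        // Stand-in for the single ZooKeeper event thread (assumption of this sketch).
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CountDownLatch splitTasksCreated = new CountDownLatch(1);
        CompletableFuture<Boolean> outcome = new CompletableFuture<>();

        // The "wait for log splitting before assigning root" step.
        Runnable blockingWait = () -> {
            try {
                outcome.complete(splitTasksCreated.await(timeoutMs, TimeUnit.MILLISECONDS));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                outcome.complete(false);
            }
        };

        try {
            // Event 1: session recovery, which ends up waiting for log splitting.
            if (offloadWait) {
                eventThread.execute(() -> worker.execute(blockingWait));
            } else {
                eventThread.execute(blockingWait);
            }
            // Event 2: the async callback that lets log splitting make progress.
            // It is queued behind event 1 on the same single thread.
            eventThread.execute(splitTasksCreated::countDown);
            return outcome.join();
        } finally {
            eventThread.shutdown();
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("wait on event thread, callback ran in time: " + simulate(false, 300));
        System.out.println("wait on worker thread, callback ran in time: " + simulate(true, 300));
    }
}
```

With the wait on the event thread, the second event cannot run until the wait times out, so the first call reports `false`; with the wait offloaded, the event thread is free to run the callback and the second call reports `true`.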

      Attachments

        1. distributed-log-splitting-hangs.png
          42 kB
          Jimmy Xiang
        2. ZK-event-thread-waiting-for-root.png
          35 kB
          Jimmy Xiang
        3. hbase-5099.patch
          6 kB
          Jimmy Xiang
        4. hbase-5099-v2.patch
          6 kB
          Jimmy Xiang
        5. hbase-5099-v3.patch
          7 kB
          Jimmy Xiang
        6. hbase-5099-v4.patch
          7 kB
          Jimmy Xiang
        7. hbase-5099-v5.patch
          7 kB
          Jimmy Xiang
        8. hbase-5099-v6.patch
          7 kB
          Jimmy Xiang
        9. 5099.92
          9 kB
          Ted Yu

            People

              Assignee: Jimmy Xiang
              Reporter: Jimmy Xiang