HBase / HBASE-21843

RegionGroupingProvider breaks the meta wal file name pattern which may cause data loss for meta region

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 2.1.0, 2.2.0
    • Fix Version/s: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0
    • Component/s: wal
    • Labels:
    • Hadoop Flags: Reviewed

      Description

      This is a bit unusual, but I have hit it twice lately, in both distributed and local standalone mode, on VMs. Somehow, after a VM pause/resume, I got into a situation where regions in meta were assigned to a given RS startcode that had no corresponding WAL dir.

      As a result, those regions never get assigned: the given RS startcode is not found anywhere by RegionServerTracker/ServerManager, so no SCP (ServerCrashProcedure) is created for it, and the regions stay "open" on a dead server forever in META.

      This could be fixed by adding an extra check in loadMeta: if the RS assigned to a region in meta is not online and has no WAL dir, mark the region as offline.
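
A rough sketch of the proposed check, in plain Java. This is not the attached patch and uses no HBase classes; the class name, method name, and the string representation of servers and regions are all hypothetical, chosen only to show the condition (server not online AND no WAL dir).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of HBase.
final class StaleAssignmentCheck {

    /**
     * Returns the regions whose recorded server (host,port,startcode) is
     * neither online nor backed by a WAL dir. Such regions would get no
     * SCP, so the proposal is to mark them offline while loading meta.
     */
    static List<String> findOrphanedRegions(Map<String, String> regionToServer,
                                            Set<String> onlineServers,
                                            Set<String> serversWithWalDir) {
        List<String> orphaned = new ArrayList<>();
        for (Map.Entry<String, String> entry : regionToServer.entrySet()) {
            String server = entry.getValue();
            boolean online = onlineServers.contains(server);
            boolean hasWalDir = serversWithWalDir.contains(server);
            // A dead server that still has a WAL dir is left alone here,
            // since normal crash handling can still pick it up.
            if (!online && !hasWalDir) {
                orphaned.add(entry.getKey());
            }
        }
        return orphaned;
    }
}
```

Only the combination of both conditions flags a region, so a server that is merely restarting, or one that is dead but still has WALs to split, is not touched.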

        Attachments

        1. HBASE-21843.master.001.patch
          5 kB
          Wellington Chevreuil
        2. HBASE-21843.patch
          5 kB
          Duo Zhang

              People

              • Assignee: Wellington Chevreuil (wchevreuil)
              • Reporter: Wellington Chevreuil (wchevreuil)
              • Votes: 0
              • Watchers: 12