HBase / HBASE-21745

Make HBCK2 be able to fix issues other than region assignment


Details

    • Reviewed
      This issue adds via its subtasks:

       * An 'HBCK Report' page in the Master UI, added by HBASE-22527+HBASE-22709+HBASE-22723+ (since 2.1.6, 2.2.1, 2.3.0). It lists inconsistencies or anomalies found via new hbase:meta consistency checking extensions added to CatalogJanitor (holes, overlaps, bad servers) and by a new 'HBCK chore' that runs less frequently and notes filesystem orphans and overlaps as well as the following conditions:
       ** Master thought this region was open, but no regionserver reported it.
       ** Master thought this region was open on Server1, but a regionserver reported Server2.
       ** More than one regionserver reported this region as open.
       Both chores can be triggered from the shell to regenerate ‘new’ reports.
       * Means of scheduling a ServerCrashProcedure (HBASE-21393).
       * An ‘offline’ hbase:meta rebuild (HBASE-22680).
       * Offline replacement of hbase.version and hbase.id.
       * Documentation on how to use the completebulkload tool to ‘adopt’ orphaned data found by the new HBCK2 ‘filesystem’ check (see below) and the ‘HBCK chore’ (HBASE-22859).
       * A ‘holes’ and ‘overlaps’ fix that runs in the master and uses a new bulk-merge facility to collapse many overlaps in one go.
       * hbase-operator-tools HBCK2 client tool got a bunch of additions:
       ** A specialized 'fix' for the case where operators ran old hbck 'offlinemeta' repair and destroyed their hbase:meta; it ties together holes in meta with orphaned data in the fs (HBASE-22567)
       ** A ‘filesystem’ command that reports on orphan data as well as bad references and hlinks, with a ‘fix’ for the latter two (based on an updated hbck1 facility).
       ** Adds back the ‘replication’ fix facility from hbck1 (HBASE-22717)
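
The 'holes' and 'overlaps' that the CatalogJanitor extensions report can be pictured as gaps and intersections between adjacent region key ranges. The following is only an illustrative sketch, not the HBase implementation: the function name `find_holes_and_overlaps` and the `(start_key, end_key)` tuple layout are assumptions made for the example, with an empty end key standing for "to the end of the table".

```python
from typing import List, Tuple

# Hypothetical model of a region: (start_key, end_key), end exclusive.
# An empty end key (b"") means the region runs to the end of the table.
Region = Tuple[bytes, bytes]


def find_holes_and_overlaps(regions: List[Region]):
    """Return (holes, overlaps) for one table's regions.

    holes    : list of (gap_start, gap_end) key ranges that no region covers
    overlaps : list of (region_a, region_b) pairs whose key ranges intersect
    """
    holes, overlaps = [], []
    ordered = sorted(regions, key=lambda r: r[0])
    if not ordered:
        return holes, overlaps

    # Track the region that reaches furthest so far, so that one wide
    # region is compared against every later region it covers.
    furthest = ordered[0]
    for cur in ordered[1:]:
        end = furthest[1]
        if end == b"" or end > cur[0]:
            overlaps.append((furthest, cur))
        elif end < cur[0]:
            holes.append((end, cur[0]))
        # Advance the marker when the current region ends later.
        if end != b"" and (cur[1] == b"" or cur[1] > end):
            furthest = cur
    return holes, overlaps


if __name__ == "__main__":
    table = [(b"", b"b"), (b"b", b"d"), (b"c", b"f"), (b"g", b"")]
    print(find_holes_and_overlaps(table))
```

The sketch ignores everything a real check must handle (multiple tables, split parents, replicas, a missing first or last region); it only shows why a sorted scan is enough to surface both kinds of problem.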

      The compound result is that hbck2 now exceeds hbck1's abilities. The provided functionality is disaggregated, as per the hbck2 philosophy of providing 'plumbing' rather than 'porcelain', so there is still work to do adding fix-it playbooks, scripting across outages, and automation.
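
The three region-state conditions listed for the 'HBCK chore' amount to comparing the Master's view of where each region is open with what the regionservers report. A minimal sketch of that comparison follows; it is not the chore's actual code, and the names `classify_regions`, `master_view`, and `rs_reports` are hypothetical, introduced only for illustration.

```python
from typing import Dict, Set


def classify_regions(master_view: Dict[str, str],
                     rs_reports: Dict[str, Set[str]]) -> Dict[str, str]:
    """Compare the Master's view with regionserver reports.

    master_view : region name -> server the Master thinks hosts it
    rs_reports  : region name -> set of servers that reported it open
    Returns region name -> description, for inconsistent regions only.
    """
    problems = {}
    for region, expected in master_view.items():
        reporters = rs_reports.get(region, set())
        if not reporters:
            problems[region] = "no regionserver reported it"
        elif len(reporters) > 1:
            problems[region] = "reported open by " + ", ".join(sorted(reporters))
        elif expected not in reporters:
            actual = next(iter(reporters))
            problems[region] = f"master expected {expected}, reported by {actual}"
    return problems


if __name__ == "__main__":
    master = {"r1": "Server1", "r2": "Server1", "r3": "Server2", "r4": "Server2"}
    reports = {"r2": {"Server2"}, "r3": {"Server1", "Server2"}, "r4": {"Server2"}}
    for region, problem in sorted(classify_regions(master, reports).items()):
        print(region, "->", problem)
```

Here `r4` is consistent and is omitted from the result, while `r1`, `r2`, and `r3` each fall under one of the three conditions.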

    Description

      This is what apurtell posted on the mailing list; HBCK2 should support


            People

              stack Michael Stack
              zhangduo Duo Zhang