  1. HBase
  2. HBASE-21648

[rsgroup] hbase shell "move_servers_rsgroup" or "balance_rsgroup" fails when it encounters a split parent region.

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Balancer, rsgroup
    • Labels: None

    Description

      An example:

      We have a table "A" in RSGroup "group1". "bd806f94a53be74e65bd76e1e6e16e5a" is a region of A and is open on RegionServer "rs1".

      Two steps reproduce this bug:

      step1: Split region bd806f94a53be74e65bd76e1e6e16e5a.

      step2: Before the parent region is cleaned up by CatalogJanitor, the client runs one of these shell commands: move_servers_rsgroup 'group2', ['rs1:60020'] or balance_rsgroup 'group1'

      The client then receives the exceptions below, and the moves of the remaining regions are interrupted.

      ERROR: org.apache.hadoop.hbase.client.DoNotRetryRegionException: bd806f94a53be74e65bd76e1e6e16e5a is not OPEN
          at org.apache.hadoop.hbase.master.procedure.AbstractStateMachineTableProcedure.checkOnline(AbstractStateMachineTableProcedure.java:189)
          at org.apache.hadoop.hbase.master.assignment.MoveRegionProcedure.<init>(MoveRegionProcedure.java:71)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:755)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.move(AssignmentManager.java:560)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveServers(RSGroupAdminServer.java:349)
          at org.apache.hadoop.hbase.rsgroup.FGRSGroupAdminServer.moveServers(FGRSGroupAdminServer.java:119)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.moveServers(RSGroupAdminEndpoint.java:209)
          at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13870)
          at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:813)
          at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
          at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
      
      For usage try 'help "move_servers_rsgroup"'
      
      ERROR: org.apache.hadoop.hbase.client.DoNotRetryRegionException: bd806f94a53be74e65bd76e1e6e16e5a is not OPEN
          at org.apache.hadoop.hbase.master.procedure.AbstractStateMachineTableProcedure.checkOnline(AbstractStateMachineTableProcedure.java:189)
          at org.apache.hadoop.hbase.master.assignment.MoveRegionProcedure.<init>(MoveRegionProcedure.java:71)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:755)
          at org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:565)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.balanceRSGroup(RSGroupAdminServer.java:516)
          at org.apache.hadoop.hbase.rsgroup.FGRSGroupAdminServer.balanceRSGroup(FGRSGroupAdminServer.java:164)
          at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.balanceRSGroup(RSGroupAdminEndpoint.java:296)
          at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13890)
          at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:813)
          at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
          at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
          at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
      
      For usage try 'help "balance_rsgroup"'

      After the split, the parent region is no longer used and will eventually be cleaned up by CatalogJanitor. So should we skip it when running move_servers_rsgroup or balance_rsgroup?
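
A minimal, self-contained sketch of the idea in the question above: skip split parent regions when walking the list of regions to move, so that one parent does not abort the whole operation. This is not the attached patch and does not use the HBase API. The `Region` class, the `splitParent` flag, `move`, and `moveAll` are hypothetical stand-ins, and the "daughter-a" and "daughter-b" names are made up for illustration. In HBase itself the check would presumably rely on the split/offline state that the master tracks for the region; that mapping is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitParentFilter {
    // Hypothetical stand-in for a region; this is NOT HBase's RegionInfo.
    static final class Region {
        final String encodedName;
        final boolean splitParent; // true once the region has been split and is offline

        Region(String encodedName, boolean splitParent) {
            this.encodedName = encodedName;
            this.splitParent = splitParent;
        }
    }

    // Models the check seen in the stack traces: a region that is not OPEN
    // cannot be moved, and the attempt fails with an exception.
    static void move(Region region) {
        if (region.splitParent) {
            throw new IllegalStateException(region.encodedName + " is not OPEN");
        }
    }

    // Moves the regions in order and returns the names of those that were moved.
    // With skipSplitParents == false, the first split parent aborts the loop,
    // which mirrors the interrupted moves described in this issue.
    static List<String> moveAll(List<Region> regions, boolean skipSplitParents) {
        List<String> moved = new ArrayList<>();
        for (Region region : regions) {
            if (skipSplitParents && region.splitParent) {
                continue; // the parent will be cleaned up later, nothing to move
            }
            move(region);
            moved.add(region.encodedName);
        }
        return moved;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
            new Region("bd806f94a53be74e65bd76e1e6e16e5a", true), // split parent
            new Region("daughter-a", false),                      // hypothetical daughter
            new Region("daughter-b", false));                     // hypothetical daughter

        System.out.println("moved with filter: " + moveAll(regions, true));
        try {
            moveAll(regions, false);
        } catch (IllegalStateException e) {
            System.out.println("aborted without filter: " + e.getMessage());
        }
    }
}
```

With the filter enabled, only the two daughters are moved; without it, the loop stops at the parent before any other region is handled.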

      Attachments

        1. HBASE-21648.v1.patch
          1 kB
          xuming

          People

            Assignee: Unassigned
            Reporter: xuming