HBase / HBASE-18215

Some advice about refactoring rsgroup


Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Balancer
    • Labels: None

    Description

      We recently integrated rsgroup into our cluster and, in doing so, found several points that may be worth refactoring. Some of these points may be wrong, but I think they are worth sharing with you.

      1. When hbase.balancer.tablesOnMaster is configured, RSGroupBasedLoadBalancer should consider assignment to the master server first in balanceCluster, roundRobinAssignment, retainAssignment and randomAssignment,
        as BaseLoadBalancer does.
      2. Why not use a local file as the persistence layer instead of the rsgroup table?
        In our implementation, we first modify the local rsgroup file, then load the group info into memory, and then run the balancer command; everything works.
        While loading, we run some sanity checks:
        (1) a server cannot belong to more than one group
        (2) a table cannot belong to more than one group
        (3) if a group has tables, it must also have servers
        (4) the default group must have servers in it
        If a sanity check fails, the rest of the process is abandoned. Working this way greatly reduces the complexity of the rsgroup implementation: there is no need to wait for the rsgroup table to come online, and methods such as moveServers, moveTables, addRSGroup, removeRSGroup and moveServersAndTables can be removed from RSGroupAdminService. Only a refresh method is needed (modify the persistence layer first, then refresh the in-memory state).
      3. We should add some group information to the master web UI.
        To do this, RSGroupBasedLoadBalancer should move to the hbase-server module, because MasterStatusTmpl.jamon would depend on it.
      4. There may be an issue in RSGroupBasedLoadBalancer.roundRobinAssignment:
        if two groups both include BOGUS_SERVER_NAME, assignments.putAll will overwrite the earlier group's entry.
      5. There may be an issue in RSGroupBasedLoadBalancer.randomAssignment:
        when the return value is BOGUS_SERVER_NAME, the AM (AssignmentManager) cannot handle it. We should return null instead of BOGUS_SERVER_NAME.
      6. When RSGroupBasedLoadBalancer.balanceCluster executes, groups are balanced one by one. If there are too many groups, we could do this in parallel.
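
      To make the four load-time checks in point 2 concrete, here is a minimal sketch. It is not our actual implementation: GroupDef and RSGroupSanityCheck are hypothetical names, servers and tables are plain strings, and the default group is assumed to be named "default".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory form of one group parsed from the local rsgroup file. */
final class GroupDef {
  final String name;
  final Set<String> servers;
  final Set<String> tables;

  GroupDef(String name, List<String> servers, List<String> tables) {
    this.name = name;
    this.servers = new HashSet<>(servers);
    this.tables = new HashSet<>(tables);
  }
}

final class RSGroupSanityCheck {
  // Assumption: the default group is named "default".
  static final String DEFAULT_GROUP = "default";

  /** Returns the list of violations; an empty list means the file may be loaded. */
  static List<String> validate(List<GroupDef> groups) {
    List<String> errors = new ArrayList<>();
    Map<String, String> serverOwner = new HashMap<>();
    Map<String, String> tableOwner = new HashMap<>();
    boolean defaultHasServers = false;

    for (GroupDef g : groups) {
      // (1) a server cannot belong to more than one group
      for (String s : g.servers) {
        String prev = serverOwner.putIfAbsent(s, g.name);
        if (prev != null && !prev.equals(g.name)) {
          errors.add("server " + s + " is in both " + prev + " and " + g.name);
        }
      }
      // (2) a table cannot belong to more than one group
      for (String t : g.tables) {
        String prev = tableOwner.putIfAbsent(t, g.name);
        if (prev != null && !prev.equals(g.name)) {
          errors.add("table " + t + " is in both " + prev + " and " + g.name);
        }
      }
      // (3) a group that has tables must also have servers
      if (!g.tables.isEmpty() && g.servers.isEmpty()) {
        errors.add("group " + g.name + " has tables but no servers");
      }
      // (4) the default group must have servers
      if (DEFAULT_GROUP.equals(g.name) && !g.servers.isEmpty()) {
        defaultHasServers = true;
      }
    }
    if (!defaultHasServers) {
      errors.add("default group has no servers");
    }
    return errors;
  }

  public static void main(String[] args) {
    List<GroupDef> groups = Arrays.asList(
        new GroupDef("default", Arrays.asList("rs1:16020"), Arrays.asList("t1")),
        new GroupDef("g1", Arrays.asList("rs1:16020"), Arrays.asList("t2")));
    // rs1:16020 is listed in two groups, so one violation is reported.
    System.out.println(validate(groups));
  }
}
```

      A refresh would call validate first and only swap the in-memory group info when the returned list is empty.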
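
      The overwrite described in point 4 can be reproduced with plain Java maps. In this sketch, strings stand in for the server and region types, BOGUS is a made-up placeholder key, and mergeInto is a hypothetical helper showing one possible fix, not existing HBase code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AssignmentMerge {
  // Hypothetical stand-in for the placeholder server key shared by several groups.
  static final String BOGUS = "bogus-server";

  /** Merges one group's result into target, appending region lists instead of replacing them. */
  static void mergeInto(Map<String, List<String>> target, Map<String, List<String>> groupResult) {
    for (Map.Entry<String, List<String>> e : groupResult.entrySet()) {
      target.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
    }
  }

  public static void main(String[] args) {
    Map<String, List<String>> g1 = new HashMap<>();
    g1.put(BOGUS, Arrays.asList("regionA"));
    Map<String, List<String>> g2 = new HashMap<>();
    g2.put(BOGUS, Arrays.asList("regionB"));

    // putAll replaces the value stored under an existing key, so regionA is lost.
    Map<String, List<String>> viaPutAll = new HashMap<>();
    viaPutAll.putAll(g1);
    viaPutAll.putAll(g2);
    System.out.println(viaPutAll.get(BOGUS)); // prints [regionB]

    // Merging per key keeps the regions from both groups.
    Map<String, List<String>> merged = new HashMap<>();
    mergeInto(merged, g1);
    mergeInto(merged, g2);
    System.out.println(merged.get(BOGUS)); // prints [regionA, regionB]
  }
}
```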
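
      For point 6, per-group balancing could be fanned out to a thread pool roughly as below. This is only a sketch under assumptions: ParallelGroupBalance and balanceGroups are hypothetical names, balanceOne stands in for whatever computes the plans for a single group, and it assumes the groups can be balanced independently of each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ParallelGroupBalance {
  /**
   * Runs balanceOne for every group on a fixed-size pool and returns all plans,
   * concatenated in the same order as the input groups.
   */
  static <G, P> List<P> balanceGroups(List<G> groups, Function<G, List<P>> balanceOne,
      int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<List<P>>> tasks = new ArrayList<>();
      for (G g : groups) {
        tasks.add(() -> balanceOne.apply(g));
      }
      List<P> plans = new ArrayList<>();
      // invokeAll waits for all tasks and returns futures in task order.
      for (Future<List<P>> f : pool.invokeAll(tasks)) {
        plans.addAll(f.get());
      }
      return plans;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while balancing groups", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("balancing one group failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> plans = balanceGroups(
        Arrays.asList("g1", "g2", "g3"),
        g -> Arrays.asList(g + "-plan"), // placeholder for the real per-group balance
        2);
    System.out.println(plans); // prints [g1-plan, g2-plan, g3-plan]
  }
}
```

      Whether this helps in practice depends on how expensive each group's balance is and on whether the per-group computation shares mutable state.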

          People

            javaman_chen chenxu