HBASE-28519

HMaster crash when upgrading from HBase-2.5.8 to HBase-3.0.0-beta-1

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1, 3.0.0-alpha-4, 3.0.0-beta-1, 2.5.8
    • Fix Version/s: 3.0.0-beta-2
    • Component/s: master
    • Labels: None

    Description

      Reproduce

      Step 1: Start an HBase-2.5.8 cluster with 4 nodes: 1 HMaster (HM), 2 RegionServers (RS), and 1 HDFS node (hadoop-2.10.2).

      Step 2: Perform a full-stop upgrade to HBase-3.0.0-beta-1 with the same layout: 1 HM, 2 RS, 1 HDFS (hadoop-2.10.2). No commands are run before the upgrade.

      The HMaster aborts with the following exception:

      2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
      java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
              at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
              at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
              at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1]
              at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
      Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues: 
              at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
              at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              ... 5 more
      2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.quotas.MasterQuotasObserver]
      2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master hmaster,16000,1712980015693: Unhandled exception. Starting shutdown. *****
      java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
              at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:384) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
              at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:324) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
              at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1204) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2494) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-1.jar:3.0.0-beta-1]
              at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
      Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues: 
              at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.makeError(BufferedMutatorOverAsyncBufferedMutator.java:107) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.internalFlush(BufferedMutatorOverAsyncBufferedMutator.java:122) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.client.BufferedMutatorOverAsyncBufferedMutator.close(BufferedMutatorOverAsyncBufferedMutator.java:166) ~[hbase-client-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:122) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:61) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:252) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
              at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1532) ~[hbase-server-3.0.0-beta-1.jar:3.0.0-beta-1]
              ... 5 more
      2024-04-13T03:47:15,971 INFO  [master/hmaster:16000:becomeActiveMaster] master.HMaster: ***** STOPPING master 'hmaster,16000,1712980015693' *****
      2024-04-13T03:47:15,971 INFO  [master/hmaster:16000:becomeActiveMaster] master.HMaster: STOPPED: Stopped by master/hmaster:16000:becomeActiveMaster
      2024-04-13T03:47:15,971 INFO  [master/hmaster:16000] hbase.HBaseServerBase: Stop info server 

      I also tried upgrading from 2.5.8 to 3.0.0-alpha-1 and to 3.0.0-alpha-4; both upgrades hit the same exception.

      I have attached my full logs.
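
      For triaging logs like the ones attached, the exception chain can be pulled out of the first stack trace with a small helper. The following is only a sketch: the class name CauseChain and the method firstCauseChain are hypothetical and are not part of HBase, and the header-matching rule (a dotted class name ending in "Exception" or "Error") is an assumption that fits the log format shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical triage helper (not part of HBase): collects the top-level
// exception and its "Caused by:" chain from the first stack trace in a log.
final class CauseChain {

    // Assumption: an exception header is a dotted class name ending in
    // "Exception" or "Error", optionally followed by ": message".
    static boolean looksLikeExceptionHeader(String line) {
        return line.matches("[\\w.$]+(Exception|Error)(: .*)?");
    }

    static List<String> firstCauseChain(List<String> logLines) {
        List<String> chain = new ArrayList<>();
        boolean inTrace = false;
        for (String raw : logLines) {
            String line = raw.trim();
            boolean isFrame = line.startsWith("at ") || line.startsWith("... ");
            if (line.startsWith("Caused by: ")) {
                if (inTrace) {
                    chain.add(line.substring("Caused by: ".length()));
                }
            } else if (isFrame) {
                // Stack frames belong to the current trace; nothing to record.
            } else if (inTrace) {
                break; // the first ordinary log line ends the first trace
            } else if (looksLikeExceptionHeader(line)) {
                chain.add(line);
                inTrace = true;
            }
        }
        return chain;
    }

    public static void main(String[] args) {
        // Lines abbreviated from the log in this report.
        List<String> sample = Arrays.asList(
            "2024-04-13T03:47:15,969 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master",
            "java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED",
            "        at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1535)",
            "Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: RetriesExhaustedException: 2 times, servers with issues: ",
            "        at org.apache.hadoop.hbase.master.TableNamespaceManager.migrateNamespaceTable(TableNamespaceManager.java:92)",
            "        ... 5 more",
            "2024-04-13T03:47:15,970 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Master server abort");
        for (String entry : firstCauseChain(sample)) {
            System.out.println(entry);
        }
    }
}
```

      Run against the master log, the last entry of the returned list is the innermost cause, which here is the RetriesExhaustedWithDetailsException raised from TableNamespaceManager.migrateNamespaceTable.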

      Root Cause

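      The exception is thrown while migrating namespace data: the failing call in the stack trace is TableNamespaceManager.migrateNamespaceTable, invoked from ClusterSchemaServiceImpl.doStart. The HBase doc mentions "hbase:namespace table has been removed and fold into hbase:meta". However, the exact root cause is still unclear.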

      Attachments

        1. hbase--master-64f850a4e287.log
          152 kB
          Ke Han
        2. all_logs.tar.gz
          90 kB
          Ke Han

          People

            Assignee: Unassigned
            Reporter: Ke Han (kehan5800)