HBase / HBASE-28150

CreateTableProcedure and DeleteTableProcedure should sleep a while before retrying


Details

    • Reviewed

    Description

      Creating a table failed while executing CREATE_TABLE_WRITE_FS_LAYOUT. The procedure then retried again and again with no pause, writing too many procedure records to master:store. We found more than 13000 master WAL files in oldWALs.

       

      Q: Should we add suspend-time logic to the CreateTableProcedure retry? I see that TransitRegionStateProcedure already has this logic.
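
To make the question concrete, here is a minimal sketch of the kind of delay calculation a retrying procedure could use between attempts. It is not taken from the HBASE-28150 patch or from TransitRegionStateProcedure; the class name `RetryBackoff`, the method `delayMillis`, and the 1 s base / 60 s cap are assumptions chosen only for illustration.

```java
/** Hypothetical helper: how long to stay suspended before retry number `attempt`. */
class RetryBackoff {
    private final long baseMillis;
    private final long maxMillis;

    RetryBackoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /** Exponential backoff: base * 2^attempt, capped at maxMillis. attempt starts at 0. */
    long delayMillis(int attempt) {
        // Clamp the exponent so the multiplication cannot overflow for large attempt counts.
        int shift = Math.min(Math.max(attempt, 0), 30);
        long delay = baseMillis * (1L << shift);
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        RetryBackoff backoff = new RetryBackoff(1000L, 60_000L); // assumed values
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + " -> wait " + backoff.delayMillis(attempt) + " ms");
        }
    }
}
```

With these assumed values the wait grows from 1 s to the 60 s cap within seven attempts, so a step that keeps failing would produce at most about one retry per minute instead of one every few tens of milliseconds.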

       

      ---------------------------------------------------------------

      Sorry, uploading the screenshot failed, so I am copying the log here.

      // 2023-10-12 12:34:35,360 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closing region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1688)
      2023-10-12 12:34:35,360 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closed themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1900)
      2023-10-12 12:34:35,360 | INFO  | PEWorker-1 | Region directories are created at hdfs://hacluster/hbase/.tmp for table themis:a | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:346)
      2023-10-12 12:34:35,362 | WARN  | PEWorker-1 | Retriable error trying to create table=themis:a state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:159)
      java.io.IOException: Unable to move table from temp=hdfs://hacluster/hbase/.tmp/data/themis/a to hbase root=hdfs://hacluster/hbase/data/themis/a
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.moveTempDirectoryToHBaseRoot(CreateTableProcedure.java:391)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:350)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:318)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:121)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:75)
              at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)
              at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1962)
              at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:221)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1988)
      2023-10-12 12:34:35,387 | INFO  | PEWorker-1 | pid=917, state=RUNNABLE:CREATE_TABLE_WRITE_FS_LAYOUT, locked=true; CreateTableProcedure table=themis:a execute state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:102)
      2023-10-12 12:34:35,414 | INFO  | RegionOpenAndInit-themis:a-pool-0 | creating {ENCODED => 513d3d5b4d3ad5c8f13bacea4a888d69, NAME => 'themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69.', STARTKEY => '', ENDKEY => ''}, tableDescriptor='themis:a', {NAME => 'f1', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}, regionDir=hdfs://hacluster/hbase/.tmp | org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:7906)
      2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Waiting for flushes and compactions to finish for the region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.waitForFlushesAndCompactions(HRegion.java:1911)
      2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Total wait time for flushes and compaction for the region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. is: 0ms | org.apache.hadoop.hbase.regionserver.HRegion.waitForFlushesAndCompactions(HRegion.java:1946)
      2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closing region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1688)
      2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closed themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1900)
      2023-10-12 12:34:35,432 | INFO  | PEWorker-1 | Region directories are created at hdfs://hacluster/hbase/.tmp for table themis:a | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:346)
      2023-10-12 12:34:35,434 | WARN  | PEWorker-1 | Retriable error trying to create table=themis:a state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:159)
      java.io.IOException: Unable to move table from temp=hdfs://hacluster/hbase/.tmp/data/themis/a to hbase root=hdfs://hacluster/hbase/data/themis/a
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.moveTempDirectoryToHBaseRoot(CreateTableProcedure.java:391)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:350)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:318)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:121)
              at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:75)
              at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)
              at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1962)
              at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:221)
              at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1988)
      2023-10-12 12:34:35,469 | INFO  | PEWorker-1 | pid=917, state=RUNNABLE:CREATE_TABLE_WRITE_FS_LAYOUT, locked=true; CreateTableProcedure table=themis:a execute state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:102)
       
      //hdfs dfs -ls /hbase/oldWALs | grep 'masterlocal' |wc -l
      13398
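
The retry rate can be estimated from the two "Retriable error" WARN lines in the log above (12:34:35,362 and 12:34:35,434). The sketch below only does that arithmetic. The class and method names are hypothetical, and two samples give a rough estimate rather than a measurement.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: estimates the retry rate from two log timestamps. */
class RetryRate {
    // Timestamp layout used by the log lines quoted in this issue.
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Milliseconds elapsed between two log timestamps. */
    static long millisBetween(String first, String second) {
        LocalDateTime a = LocalDateTime.parse(first, LOG_TS);
        LocalDateTime b = LocalDateTime.parse(second, LOG_TS);
        return Duration.between(a, b).toMillis();
    }

    /** Retries per second if every retry takes intervalMillis. */
    static double retriesPerSecond(long intervalMillis) {
        return 1000.0 / intervalMillis;
    }

    public static void main(String[] args) {
        // Timestamps of the two consecutive WARN lines copied from the log.
        long interval = millisBetween("2023-10-12 12:34:35,362", "2023-10-12 12:34:35,434");
        System.out.println(interval + " ms between retries"); // 72 ms
        System.out.println(retriesPerSecond(interval) + " retries per second");
        System.out.println((long) (retriesPerSecond(interval) * 3600) + " retries per hour");
    }
}
```

At 72 ms per attempt the procedure retries roughly 14 times per second, which is about 50,000 attempts per hour if nothing slows it down.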
       

       

      Analysis:

      This happened because I deleted the namespace dir in HDFS directly but did not delete the namespace from hbase:namespace, so creating a table in this namespace hangs.

      It was an operational error.

      But if some other step fails in CreateTableProcedure, I think the same issue will occur again.

      Attachments

        1. HBASE-28150.patch
          14 kB
          chaijunjie

            People

              chaijunjie chaijunjie