HBase / HBASE-21139

Concurrent invocations of MetricsTableAggregateSourceImpl.getOrCreateTableSource may return unregistered MetricsTableSource

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate

    Description

      From the test output of TestRestoreFlushSnapshotFromClient:

      2018-09-01 21:09:38,174 WARN  [member: 'hw13463.attlocal.net,49623,1535861370108' subprocedure-pool6-thread-1] snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool(348): Got Exception in SnapshotSubprocedurePool
      java.util.concurrent.ExecutionException: java.lang.NullPointerException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:324)
        at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:173)
        at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:193)
        at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:189)
        at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:53)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.MetricsTableSourceImpl.updateFlushTime(MetricsTableSourceImpl.java:375)
        at org.apache.hadoop.hbase.regionserver.MetricsTable.updateFlushTime(MetricsTable.java:56)
        at org.apache.hadoop.hbase.regionserver.MetricsRegionServer.updateFlush(MetricsRegionServer.java:210)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2826)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2444)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2416)
        at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2306)
        at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2209)
        at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:115)
        at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:77)
      

      In MetricsTableAggregateSourceImpl.getOrCreateTableSource:

          MetricsTableSource prev = tableSources.putIfAbsent(table, source);
      
          if (prev != null) {
            return prev;
          } else {
            // register the new metrics now
            register(source);
      

      Suppose threads t1 and t2 execute the above code concurrently.
      t1 calls putIfAbsent first, gets null back, and proceeds to run register(source).
      Before register(source) completes, a context switch occurs: t2 reaches putIfAbsent and gets back the instance stored by t1, which is not registered yet.
      t2 then uses that unregistered MetricsTableSource, and we end up with the NullPointerException shown in the stack trace.
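
One possible way to close the window described above is to make creation, registration, and publication a single atomic step. The sketch below is not the HBase code and not necessarily the change made in the issue this one duplicates. TableSource and TableSourceRegistry are hypothetical stand-ins for MetricsTableSource and MetricsTableAggregateSourceImpl, and register() is reduced to setting a flag. It relies on ConcurrentHashMap.computeIfAbsent, which applies the mapping function at most once per key and does not expose the value until the function returns.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for MetricsTableSource (not the HBase class).
final class TableSource {
    final String table;
    private volatile boolean registered;

    TableSource(String table) {
        this.table = table;
    }

    // Stand-in for the real registration work; here it only sets a flag.
    void register() {
        registered = true;
    }

    boolean isRegistered() {
        return registered;
    }
}

// Hypothetical stand-in for MetricsTableAggregateSourceImpl.
final class TableSourceRegistry {
    private final ConcurrentHashMap<String, TableSource> tableSources =
        new ConcurrentHashMap<>();

    // Counts how many sources were registered, for illustration only.
    final AtomicInteger registrations = new AtomicInteger();

    TableSource getOrCreateTableSource(String table) {
        // The source becomes visible in the map only after the mapping
        // function returns, so callers never receive an unregistered source.
        // Assumption: register() does not modify tableSources itself, which
        // ConcurrentHashMap.computeIfAbsent does not allow.
        return tableSources.computeIfAbsent(table, t -> {
            TableSource source = new TableSource(t);
            source.register();
            registrations.incrementAndGet();
            return source;
        });
    }
}
```

The trade-off is that registration now runs while the map entry is being computed, so other threads updating nearby keys may block briefly until it finishes. If registration were expensive, a per-source lock or latch that callers wait on would be an alternative.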

            People

              Assignee: Unassigned
              Reporter: Ted Yu (yuzhihong@gmail.com)