Details
Type: Bug
Status: In Progress
Priority: Critical
Resolution: Unresolved
Affects Version/s: 2.3.4
Description
2022-04-13 12:45:37,122 ERROR [regionserver/xxxxxx:26020-shortCompactions-0] regionserver.CompactSplit: Compaction failed region=XXX:XXX,002CX21205070934507532021052320210523174923,1628091302516.7d2e05ad63b91843d438d2464a908d49., storeName=7d2e05ad63b91843d438d2464a908d49/info, priority=90, startTime=1649825135950
java.lang.NegativeArraySizeException
    at org.apache.hadoop.hbase.CellUtil.cloneQualifier(CellUtil.java:120)
    at org.apache.hadoop.hbase.ByteBufferKeyValue.getQualifierArray(ByteBufferKeyValue.java:112)
    at org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(CellUtil.java:1335)
    at org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(CellUtil.java:1318)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.getMidpoint(HFileWriterImpl.java:384)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishBlock(HFileWriterImpl.java:349)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.checkBlockBoundary(HFileWriterImpl.java:328)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.append(HFileWriterImpl.java:739)
    at org.apache.hadoop.hbase.regionserver.StoreFileWriter.append(StoreFileWriter.java:299)
    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:410)
    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:333)
    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1544)
    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2288)
    at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:619)
    at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:661)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
We have hit the exception above many times. Usually the compaction is retried and succeeds the next time. Sometimes, however, it makes getMidpoint return a wrong result, which then produces an abnormal index block such as the following:
068c892d122//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
1//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
068c896a6fc6f155beaddab036a4225ef79_12022011420220114205945/info:q/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
This index block then leads to an endless loop in org.apache.hadoop.hbase.regionserver.KeyValueHeap#generalizedSeek.
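To make the abnormality concrete, here is a minimal sketch that checks whether the row portions of the index keys quoted above are in ascending order. It assumes only that row keys are ordered by unsigned lexicographic byte comparison; the class and method names (IndexOrderCheck, compareRows, firstOutOfOrder) are made up for illustration and are not HBase APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class IndexOrderCheck {
    // Unsigned lexicographic comparison of two row keys (illustrative helper).
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Returns the index of the first row that sorts before its predecessor,
    // or -1 if the rows are in non-decreasing order.
    static int firstOutOfOrder(List<String> rows) {
        for (int i = 1; i < rows.size(); i++) {
            byte[] prev = rows.get(i - 1).getBytes(StandardCharsets.UTF_8);
            byte[] cur = rows.get(i).getBytes(StandardCharsets.UTF_8);
            if (compareRows(prev, cur) > 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Row portions of the three index keys from the abnormal block.
        List<String> rows = Arrays.asList(
            "068c892d122",
            "1",
            "068c896a6fc6f155beaddab036a4225ef79_12022011420220114205945");
        System.out.println("first out-of-order index: " + firstOutOfOrder(rows));
    }
}
```

The short key "1" sorts after the key that follows it, so the block is not sorted. Seek logic that relies on index keys being sorted can then fail to make progress.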
The cause of this problem is that lastCellOfPreviousBlock holds a reference to cells from the read path (HBASE-16372).
I have fixed it and will create a PR for it.
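The failure mode described above can be sketched without HBase. The example below assumes a simplified, hypothetical cell layout (a 4-byte length prefix followed by the qualifier bytes) in a buffer that the read path later reuses. A retained reference then reads garbage and can throw NegativeArraySizeException, while a deep copy stays valid. BufferBackedCell, deepCopy and demo are illustrative names, and copying is shown only as one plausible remedy, not necessarily the change made in the PR.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class StaleCellReference {
    /** Hypothetical cell: a 4-byte length prefix followed by the qualifier bytes. */
    static final class BufferBackedCell {
        private final ByteBuffer buf;
        private final int offset;

        BufferBackedCell(ByteBuffer buf, int offset) {
            this.buf = buf;
            this.offset = offset;
        }

        /** Reads the length from the buffer on every call, so it trusts the current contents. */
        byte[] cloneQualifier() {
            int len = buf.getInt(offset);
            byte[] out = new byte[len]; // throws NegativeArraySizeException if len < 0
            for (int i = 0; i < len; i++) {
                out[i] = buf.get(offset + Integer.BYTES + i);
            }
            return out;
        }

        /** Copies the cell into a private heap buffer that no one else overwrites. */
        BufferBackedCell deepCopy() {
            byte[] q = cloneQualifier();
            ByteBuffer own = ByteBuffer.allocate(Integer.BYTES + q.length);
            own.putInt(q.length).put(q);
            return new BufferBackedCell(own, 0);
        }
    }

    static String describe(BufferBackedCell cell) {
        try {
            return new String(cell.cloneQualifier(), StandardCharsets.UTF_8);
        } catch (NegativeArraySizeException e) {
            return "NegativeArraySizeException";
        }
    }

    /** Returns {result via retained reference, result via retained copy} after buffer reuse. */
    static String[] demo() {
        ByteBuffer block = ByteBuffer.allocate(32);
        block.putInt(0, 1);        // qualifier length = 1
        block.put(4, (byte) 'q');  // qualifier bytes = "q"

        BufferBackedCell retainedRef = new BufferBackedCell(block, 0); // shares the block
        BufferBackedCell retainedCopy = retainedRef.deepCopy();        // owns its bytes

        // Simulate the read path reusing the block for unrelated data.
        for (int i = 0; i < block.capacity(); i++) {
            block.put(i, (byte) 0xFF);
        }
        return new String[] { describe(retainedRef), describe(retainedCopy) };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("reference: " + r[0]);
        System.out.println("copy:      " + r[1]);
    }
}
```

In this sketch the retained reference fails because the length prefix now reads as -1, whereas the copy still returns "q". The trade-off is one extra allocation per retained cell.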