Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.5.3
- Fix Version/s: None
- Component/s: None
Description
We're in the process of upgrading an HBase installation from 2.2.4 to 2.5.3. We currently use the Zstd compression codec provided by our Hadoop installation. Due to some other class path issues (Netty conflicts related to the async WAL provider), we would like to remove Hadoop from the class path.
However, when using the Zstd compression that ships with HBase (which is backed by https://github.com/luben/zstd-jni), we seem to hit an incompatibility. When restarting a node to use this implementation, we saw errors like the following:
2023-03-10 16:33:01,925 WARN [RS_OPEN_REGION-regionserver/n2:16020-0] handler.AssignRegionHandler: Failed to open region NAMESPACE:TABLE,,1673888962751.cdb726dad4eaabf765969f195e91c737., will report to master
java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1148)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1091)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:994)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:941)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7228)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7183)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7159)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7118)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7074)
    at org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
    at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:288)
    at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:338)
    at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:297)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6359)
    at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1114)
    at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1111)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    ... 3 more
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
    at org.apache.hadoop.hbase.io.hfile.HFileInfo.initMetaAndIndex(HFileInfo.java:392)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:394)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:518)
    at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:225)
    at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:266)
    ... 6 more
Caused by: java.io.IOException: Premature EOF from inputStream, but still need 2883 bytes
    at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:153)
    at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:104)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:644)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl$1.nextBlock(HFileBlock.java:1397)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl$1.nextBlockWithBlockType(HFileBlock.java:1407)
    at org.apache.hadoop.hbase.io.hfile.HFileInfo.initMetaAndIndex(HFileInfo.java:365)
    ... 10 more
I've been able to reproduce the issue with something like:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.fs.HFileSystem;
import org.apache.hadoop.hbase.io.hfile.HFile;

Configuration conf = HBaseConfiguration.create();
// Force the zstd-jni backed codec instead of Hadoop's native one.
conf.set("hbase.io.compress.zstd.codec", "org.apache.hadoop.hbase.io.compress.zstd.ZstdCodec");
HFileSystem fs = (HFileSystem) HFileSystem.get(conf);
HFile.createReader(fs, new Path(...), conf);
with a file from HDFS that was created with the native compressor from Hadoop.
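As a side note for anyone triaging this: one quick diagnostic (my own sketch, not something HBase provides) is to check whether the compressed block bytes actually begin with the standard zstd frame magic number, 0xFD2FB528 little-endian per RFC 8878. If Hadoop's native codec wrote something zstd-jni's frame-oriented decoder does not recognize, the mismatch should be visible at this level. The class name `ZstdMagicCheck` and the idea of feeding it a dumped block are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic: every zstd frame starts with the little-endian
// magic number 0xFD2FB528 (bytes 0x28 0xB5 0x2F 0xFD), per RFC 8878.
public class ZstdMagicCheck {
    static final byte[] ZSTD_MAGIC = { (byte) 0x28, (byte) 0xB5, (byte) 0x2F, (byte) 0xFD };

    // Returns true if the buffer begins with the zstd frame magic.
    static boolean startsWithZstdMagic(byte[] data) {
        if (data.length < ZSTD_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < ZSTD_MAGIC.length; i++) {
            if (data[i] != ZSTD_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a raw compressed block dumped from the HFile
        // (how to extract one is out of scope here).
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(startsWithZstdMagic(data)
            ? "looks like a standard zstd frame"
            : "no zstd frame magic found");
    }
}
```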
Note that I only suspect that this issue is caused by Zstd! In our test environment we are already running 2.5.3 with reasonable success. The issue arises when we drop Hadoop from the class path and use the 'built in' compression. But that's not hard evidence that Zstd is the root cause, of course.
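For completeness, the cluster-wide equivalent of the reproduction's `conf.set(...)` call is an hbase-site.xml entry selecting the zstd-jni backed codec. This is only a sketch of what our changed configuration amounts to, using the property name and class from the snippet above:

```xml
<!-- hbase-site.xml: use the zstd-jni backed codec shipped with HBase
     instead of relying on Hadoop's native zstd support. -->
<property>
  <name>hbase.io.compress.zstd.codec</name>
  <value>org.apache.hadoop.hbase.io.compress.zstd.ZstdCodec</value>
</property>
```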