HBase / HBASE-27706

Additional Zstandard codec compatible with the Hadoop native one

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.5.3
    • Fix Version/s: None
    • Component/s: compatibility
    • Labels: None

    Description

       

      We're in the process of upgrading an HBase installation from 2.2.4 to 2.5.3. We currently use Zstd compression from our Hadoop installation. Because of some other class path issues (Netty issues related to the async WAL provider), we would like to remove Hadoop from the class path.

      However, when using the Zstd compression from HBase (which uses https://github.com/luben/zstd-jni), we seem to hit an incompatibility. When we restarted a node to use this implementation, we got errors like the following:

      2023-03-10 16:33:01,925 WARN  [RS_OPEN_REGION-regionserver/n2:16020-0] handler.AssignRegionHandler: Failed to open region NAMESPACE:TABLE,,1673888962751.cdb726dad4eaabf765969f195e91c737., will report to master
      java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
              at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1148)
              at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1091)
              at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:994)
              at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:941)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7228)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7183)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7159)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7118)
              at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7074)
              at org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
              at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
              at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
              at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:288)
              at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:338)
              at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:297)
              at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6359)
              at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1114)
              at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1111)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              ... 3 more
      Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
              at org.apache.hadoop.hbase.io.hfile.HFileInfo.initMetaAndIndex(HFileInfo.java:392)
              at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:394)
              at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:518)
              at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:225)
              at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:266)
              ... 6 more
      Caused by: java.io.IOException: Premature EOF from inputStream, but still need 2883 bytes
              at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:153)
              at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:104)
              at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:644)
              at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl$1.nextBlock(HFileBlock.java:1397)
              at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl$1.nextBlockWithBlockType(HFileBlock.java:1407)
              at org.apache.hadoop.hbase.io.hfile.HFileInfo.initMetaAndIndex(HFileInfo.java:365)
              ... 10 more 

      I've been able to reproduce the issue with something like:

      Configuration conf = HBaseConfiguration.create();
      conf.set("hbase.io.compress.zstd.codec", "org.apache.hadoop.hbase.io.compress.zstd.ZstdCodec");
      HFileSystem fs = (HFileSystem) HFileSystem.get(conf);
      HFile.createReader(fs, new Path(...), conf); 

      with a file from HDFS that was created with the native compressor from Hadoop.

      Note that I only suspect that this issue is caused by Zstd! In our test environment we are already running 2.5.3 with reasonable success. The issue arises when we drop Hadoop from the class path and use the 'built-in' compression. Of course, that's not hard evidence that Zstd is the root cause.
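
One way to gather harder evidence might be to look at the first bytes of a compressed block payload and check whether they are the Zstandard frame magic number (0xFD2FB528, stored little-endian, per RFC 8878). The following is only a minimal sketch: the class name ZstdFrameCheck and the method startsWithZstdFrame are hypothetical, it assumes the compressed payload bytes have already been extracted from the file by other means, and it does not parse HFiles itself. If the payload does not start with the magic number, that would suggest either extra framing in front of the Zstd data or a different codec, but it would not by itself identify the root cause.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ZstdFrameCheck {
    // Magic number that opens every Zstandard frame (RFC 8878); it is stored little-endian.
    static final int ZSTD_MAGIC = 0xFD2FB528;

    /** Returns true if the payload starts with the Zstandard frame magic number. */
    static boolean startsWithZstdFrame(byte[] payload) {
        if (payload == null || payload.length < 4) {
            return false; // too short to hold a magic number
        }
        int first = ByteBuffer.wrap(payload, 0, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
        return first == ZSTD_MAGIC;
    }

    public static void main(String[] args) {
        // Hand-written sample bytes for illustration, not taken from a real HFile.
        byte[] zstdLike = {(byte) 0x28, (byte) 0xB5, (byte) 0x2F, (byte) 0xFD, 0x00};
        byte[] other = {0x00, 0x00, 0x10, 0x00, 0x28};
        System.out.println(startsWithZstdFrame(zstdLike));
        System.out.println(startsWithZstdFrame(other));
    }
}
```

Running the same check on a payload written with the Hadoop native compressor and on one written with the HBase codec would show whether the two begin with the same bytes.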

          People

            Assignee: Unassigned
            Reporter: Frens Jan Rumph (frensjan)