  1. Hadoop HDFS
  2. HDFS-15642

EC: random errors when EC policy is enabled on HBase data


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: dfsclient, ec, erasure-coding
    • Labels: None
    • Environment: Hadoop 3.1.1, EC policy RS-3-2-1024k, 6 datanodes

    Description

      We enabled the EC policy 'RS-3-2-1024k' on the HBase data directory. Errors occur randomly when applications send requests to the RegionServer, but I can still fetch the same file with the shell command 'hdfs dfs -get xxx'. This seems to be a problem in the HDFS client. Logs are as follows:

       

      ERROR LOG from atlas

      Mon Oct 19 12:07:57 UTC 2020, RpcRetryingCaller{globalStartTime=1603109277452, pause=100, maxAttempts=16}, java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader reader=hdfs://xxxx/hbase/data/default/apache_atlas_janus_beta_v2/5f4e4eb280e505048a955306fd6a24ea/e/52625ed1fd18401d8ed2a10955113833, compression=gz, cacheConf=blockCache=LruBlockCache{blockCount=5423, currentSize=346.39 MB, freeSize=2.85 GB, maxSize=3.19 GB, heapSize=346.39 MB, minSize=3.03 GB, minFactor=0.95, multiSize=1.52 GB, multiFactor=0.5, singleSize=775.81 MB, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false, firstKey=Optional[\x90\x00\x00\x00\x00*\x08\x00/e:\x02/1577188525781/Put/seqid=0], lastKey=Optional[\xA8\x00\x00\x00\x05\xBA\x1B\x80/e:\xB9\x80\x00\x00D\x00e\x00v\x00i\x00c\x00e\x00P\x00o\x00i\x00n\x00t\x00T\x00a\x00g\x00\x00\x00\x00\xAAX-\x01\x00\xB0\x17j\x0CD\x15/1600779251239/Put/seqid=0], avgKeyLen=28, avgValueLen=30, entries=3537499, length=40156585, cur=null] to key org.apache.hadoop.hbase.PrivateCellUtil$FirstOnRowDeleteFamilyCell@13f66bbf
      at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:246)
      at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:395)
      at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:250)
      at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:2031)
      at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2022)
      at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:6408)
      at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:6388)
      at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2926)
      at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2906)
      at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2888)
      at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2882)
      at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2561)
      at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2488)
      at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42186)
      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
      at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
      Caused by: java.io.IOException: 3 missing blocks, the stripe is: AlignedStripe(Offset=184755, length=13077, fetchedChunksNum=0, missingChunksNum=3); locatedBlocks is: LocatedBlocks{; fileLength=40156585; underConstruction=false; blocks=[LocatedStripedBlock{BP-1197414916-10.27.20.30-1535978156945}]; lastLocatedBlock=LocatedStripedBlock{BP-1197414916-10.27.20.30-1535978156945:blk_-9223372036844403328_315953052; getBlockSize()=40156585; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.27.22.79:9866,DS-d79c402a-1845-4b3f-893e-f84d94085b2a,DISK], DatanodeInfoWithStorage[10.27.20.39:9866,DS-7db9e8fe-43db-490a-bda3-b20e3dc5128a,DISK], DatanodeInfoWithStorage[10.27.20.42:9866,DS-a5c499be-e9bb-464a-aba0-76a68ebff303,DISK], DatanodeInfoWithStorage[10.27.20.41:9866,DS-dddc7bcd-766f-4a2e-a9ca-f4d5cd97490b,DISK], DatanodeInfoWithStorage[10.27.20.40:9866,DS-f621b6a3-c1e8-49ed-bc54-6ced3c6944bb,DISK]]; indices=[0, 1, 2, 3, 4]}; isLastBlockComplete=true; ecPolicy=ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2]}
      at org.apache.hadoop.hdfs.StripeReader.checkMissingBlocks(StripeReader.java:177)
      at org.apache.hadoop.hdfs.StripeReader.readParityChunks(StripeReader.java:209)
      at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:339)
      at org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(DFSStripedInputStream.java:485)
      at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1348)
      at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1312)
      at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:92)
      at org.apache.hadoop.hbase.io.hfile.HFileBlock.positionalReadWithExtra(HFileBlock.java:808)
      at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readAtOffset(HFileBlock.java:1568)
      at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1772)
      at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1597)
      at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1496)
      at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:340)
      at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:856)
      at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:806)
      at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:327)
      at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:228)
      ... 17 more
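
      To make the numbers in the "Caused by" line easier to read, here is a minimal sketch of the striped-layout arithmetic for this policy. The cell size and unit counts are taken from the ecPolicy printed in the log. The helper names (locate, is_recoverable) are hypothetical and are not part of the HDFS client. The sketch also assumes, for illustration only, that the failed read started at logical file offset 184755; the Offset printed in AlignedStripe appears to be relative to an internal block, so the real logical offset may differ.

```python
# Sketch only: maps a logical byte offset of a striped (erasure-coded) file
# to the stripe and internal data block that hold it. Not HDFS client code.

CELL_SIZE = 1048576      # CellSize from the ecPolicy in the log
NUM_DATA_UNITS = 3       # numDataUnits for RS-3-2-1024k
NUM_PARITY_UNITS = 2     # numParityUnits for RS-3-2-1024k


def locate(offset):
    """Return (stripe index, data block index, offset inside that block)."""
    cell_index = offset // CELL_SIZE               # cells are laid out round-robin
    stripe_index = cell_index // NUM_DATA_UNITS    # one cell per data block per stripe
    block_index = cell_index % NUM_DATA_UNITS
    offset_in_block = stripe_index * CELL_SIZE + offset % CELL_SIZE
    return stripe_index, block_index, offset_in_block


def is_recoverable(missing_chunks):
    """A stripe can be decoded only if at most NUM_PARITY_UNITS chunks are missing."""
    return missing_chunks <= NUM_PARITY_UNITS


if __name__ == "__main__":
    # Assumed logical range: Offset=184755, length=13077 from the log.
    print(locate(184755), locate(184755 + 13077 - 1))
    # missingChunksNum=3 from the log exceeds the two parity units.
    print(is_recoverable(3))
```

      Under that assumption the whole 13077-byte range sits in the first cell of the first stripe, so it touches a single data block. With missingChunksNum=3 against only two parity units, the stripe cannot be decoded, which matches the point where StripeReader.checkMissingBlocks throws. This does not explain why the chunks were reported missing while 'hdfs dfs -get' succeeds.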

       

          People

            Assignee: Unassigned
            Reporter: gaozhan ding (lalapala)
            Votes: 0
            Watchers: 7
