HBase / HBASE-26036

DBB released too early and dirty data for some operations


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1, 2.0.0
    • Fix Version/s: 3.0.0-alpha-1, 2.5.0, 2.4.5
    • Component/s: rpc
    • Labels: None

    Description

      Before HBASE-25187, we saw regionserver JVM crashes on our production clusters. The core dump shows the following stack:

      Stack: [0x00007f621ba8d000,0x00007f621bb8e000],  sp=0x00007f621bb8c0e0,  free space=1020k
      Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
      J 10829 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.getTimestamp()J (9 bytes) @ 0x00007f6a5ee11b2d [0x00007f6a5ee11ae0+0x4d]
      J 22844 C2 org.apache.hadoop.hbase.regionserver.HRegion.doCheckAndRowMutate([B[B[BLorg/apache/hadoop/hbase/filter/CompareFilter$CompareOp;Lorg/apache/hadoop/hbase/filter/ByteArrayComparable;Lorg/apache/hadoop/hbase/client/RowMutations;Lorg/apache/hadoop/hbase/client/Mutation;Z)Z (540 bytes) @ 0x00007f6a60bed144 [0x00007f6a60beb320+0x1e24]
      J 17972 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.checkAndRowMutate(Lorg/apache/hadoop/hbase/regionserver/Region;Ljava/util/List;Lorg/apache/hadoop/hbase/CellScanner;[B[B[BLorg/apache/hadoop/hbase/filter/CompareFilter$CompareOp;Lorg/apache/hadoop/hbase/filter/ByteArrayComparable;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;)Z (312 bytes) @ 0x00007f6a5f4a7ed0 [0x00007f6a5f4a6f40+0xf90]
      J 26197 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(Lorg/apache/hbase/thirdparty/com/google/protobuf/RpcController;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MultiRequest;)Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MultiResponse; (644 bytes) @ 0x00007f6a61538b0c [0x00007f6a61537940+0x11cc]
      J 26332 C2 org.apache.hadoop.hbase.ipc.RpcServer.call(Lorg/apache/hadoop/hbase/ipc/RpcCall;Lorg/apache/hadoop/hbase/monitoring/MonitoredRPCHandler;)Lorg/apache/hadoop/hbase/util/Pair; (566 bytes) @ 0x00007f6a615e8228 [0x00007f6a615e79c0+0x868]
      J 20563 C2 org.apache.hadoop.hbase.ipc.CallRunner.run()V (1196 bytes) @ 0x00007f6a60711a4c [0x00007f6a60711000+0xa4c]
      J 19656% C2 org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(Ljava/util/concurrent/BlockingQueue;Ljava/util/concurrent/atomic/AtomicInteger;)V (338 bytes) @ 0x00007f6a6039a414 [0x00007f6a6039a320+0xf4]
      j  org.apache.hadoop.hbase.ipc.RpcExecutor$1.run()V+24
      j  java.lang.Thread.run()V+11
      v  ~StubRoutines::call_stub
      

      I have written a unit test (UT) that reproduces this error 100% of the time.

      After HBASE-25187, the JVM no longer crashes, but the check result of checkAndMutate will be false, because it reads wrong/dirty data from the released ByteBuff.
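
      To illustrate the dirty-read half of the problem, here is a minimal sketch in plain Java. It is not HBase code: TinyPool, TimestampView and the method names are hypothetical stand-ins for a pooled direct buffer allocator and a cell that keeps a reference to pooled memory. It only shows how a value read through a view can change once the backing buffer goes back to the pool too early. It does not reproduce the JVM crash.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

public class EarlyReleaseSketch {

    // Hypothetical one-buffer pool, standing in for a pooled direct buffer allocator.
    static final class TinyPool {
        private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

        TinyPool() {
            free.push(ByteBuffer.allocateDirect(Long.BYTES));
        }

        ByteBuffer take() {
            ByteBuffer b = free.pop();
            b.clear();
            return b;
        }

        void release(ByteBuffer b) {
            free.push(b);
        }
    }

    // Hypothetical cell-like view: it references the pooled buffer instead of copying it.
    static final class TimestampView {
        private final ByteBuffer backing;

        TimestampView(ByteBuffer backing) {
            this.backing = backing;
        }

        long getTimestamp() {
            return backing.getLong(0);
        }
    }

    // Buggy order: the buffer is released while the view is still in use.
    static long readAfterEarlyRelease() {
        TinyPool pool = new TinyPool();
        ByteBuffer first = pool.take();
        first.putLong(0, 1000L);
        TimestampView view = new TimestampView(first);

        pool.release(first);             // released too early
        ByteBuffer second = pool.take(); // the same memory is handed to another request
        second.putLong(0, 9999L);        // the other request overwrites it

        return view.getTimestamp();      // dirty read: the other request's value
    }

    // Correct order: finish reading through the view, then release.
    static long readBeforeRelease() {
        TinyPool pool = new TinyPool();
        ByteBuffer first = pool.take();
        first.putLong(0, 1000L);
        TimestampView view = new TimestampView(first);

        long ts = view.getTimestamp();
        pool.release(first);
        return ts;
    }

    public static void main(String[] args) {
        System.out.println("early release: " + readAfterEarlyRelease());
        System.out.println("late release:  " + readBeforeRelease());
    }
}
```

      In the sketch, a comparison against the expected timestamp would fail in the early-release case, which is the same kind of symptom as the false check result described above.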


          People

            Assignee: Xiaolin Ha
            Reporter: Xiaolin Ha
            Votes: 0
            Watchers: 13
