HBASE-26384

Segment already flushed to HFile may still remain in CompactingMemStore


    Description

      When CompactingMemStore prepares to flush, CompactingMemStore.snapshot invokes the following CompactingMemStore.pushPipelineToSnapshot method to get the snapshot. Lines 570 and 575 use CompactionPipeline#version to track whether the Segments in CompactionPipeline#pipeline have changed between getting the VersionedSegmentsList at line 570 and emptying CompactionPipeline#pipeline at line 575.

        565    private void pushPipelineToSnapshot() {
        566      int iterationsCnt = 0;
        567      boolean done = false;
        568      while (!done) {
        569        iterationsCnt++;
        570        VersionedSegmentsList segments = pipeline.getVersionedList();
        571        pushToSnapshot(segments.getStoreSegments());
        572        // swap can return false in case the pipeline was updated by ongoing compaction
        573        // and the version increased; the chance of it happening is very low
        574        // In Swap: don't close segments (they are in snapshot now) and don't update the region size
        575        done = pipeline.swap(segments, null, false, false);
                   ......
               }
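
      For context, the safety check that line 575 relies on lives in CompactionPipeline#swap. Below is a minimal sketch of it, modeled on the HBase source (details vary slightly across branches):

        public boolean swap(VersionedSegmentsList versionedList, ImmutableSegment segment,
            boolean closeSuffix, boolean updateRegionSize) {
          // If the pipeline changed since getVersionedList(), refuse the swap so that
          // pushPipelineToSnapshot() loops and retries with a fresh versioned list.
          if (versionedList.getVersion() != version) {
            return false;
          }
          synchronized (pipeline) {
            if (versionedList.getVersion() != version) {
              return false;
            }
            swapSuffix(versionedList.getStoreSegments(), segment, closeSuffix);
          }
          // ... region size accounting elided ...
          return true;
        }

      The correctness of this check rests on every mutation of the pipeline incrementing CompactionPipeline#version, which is exactly what flattening fails to do on branch-2.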
         

      However, when CompactingMemStore#inMemoryCompaction executes CompactionPipeline#flattenOneSegment, it does not change CompactionPipeline#version. So if an in-memory compaction executes CompactingMemStore#flattenOneSegment between line 570 and line 575 above, CompactionPipeline#version does not change even though a Segment in CompactionPipeline has been replaced. Because CompactionPipeline#version is unchanged, pipeline.swap at line 575 thinks it is safe to invoke the following CompactionPipeline#swapSuffix method to remove the Segments from CompactionPipeline; but since CompactingMemStore#flattenOneSegment has replaced the Segment in CompactionPipeline, that Segment is not removed at line 295 and remains in CompactionPipeline.

        293    private void swapSuffix(List<? extends Segment> suffix, ImmutableSegment segment,
        294        boolean closeSegmentsInSuffix) {
        295      pipeline.removeAll(suffix);
        296      if (segment != null) pipeline.addLast(segment);
                 ......
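
      To see why the removeAll at line 295 silently misses the flattened segment, here is a self-contained plain-Java sketch (illustration only, not HBase code): flattening replaces the Segment object wholesale, and LinkedList#removeAll matches elements by equality (identity in practice, since flattening creates an entirely new segment object), so the stale suffix no longer matches anything in the pipeline.

        import java.util.Arrays;
        import java.util.LinkedList;
        import java.util.List;

        public class RemoveAllSketch {
          public static void main(String[] args) {
            Object oldSegment = new Object();
            LinkedList<Object> pipeline = new LinkedList<>(Arrays.asList(oldSegment));

            // The snapshot thread captures the suffix (analogous to getVersionedList()).
            List<Object> suffix = Arrays.asList(oldSegment);

            // A concurrent in-memory flattening replaces the segment object in place.
            pipeline.set(0, new Object());

            // swapSuffix()'s removeAll() finds no match: the pipeline still holds a
            // segment whose cells were just flushed to an HFile.
            pipeline.removeAll(suffix);
            System.out.println(pipeline.size()); // prints 1, not 0
          }
        }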
      

      However, CompactingMemStore.snapshot thinks it succeeded and goes on to flush the Segment it obtained as normal, while a Segment containing the same cells is left behind in CompactingMemStore. Leaving an already-flushed Segment in the MemStore is dangerous: if a major compaction runs before the leftover Segment is flushed, the data may become erroneous.

      My fix in the PR is as follows:

      1. Increase CompactionPipeline#version in CompactingMemStore#flattenOneSegment.
         Branch-2 has this problem but master does not, because the branch-2 patch for HBASE-18375 omitted this increment.
      2. In CompactionPipeline#swapSuffix, explicitly check that each Segment in the suffix input parameter is the same as the corresponding Segment in the pipeline, one by one from the last element of suffix to the first (see the sketch after this list). I think explicitly throwing an exception is better than hiding the error and causing a subtle problem.
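
      A minimal sketch of the check described in fix 2 (a hypothetical generic helper for illustration; the actual change is in the PRs below): walk the suffix from back to front against the tail of the pipeline, comparing by reference, and fail loudly on any mismatch.

        import java.util.Iterator;
        import java.util.LinkedList;
        import java.util.List;
        import java.util.ListIterator;

        final class SuffixCheck {
          // Throws if the suffix does not exactly match the tail of the pipeline,
          // e.g. because a concurrent flattening replaced one of its segments.
          static <T> void checkSuffixMatchesPipeline(LinkedList<T> pipeline, List<? extends T> suffix) {
            Iterator<T> pipelineBackward = pipeline.descendingIterator();
            ListIterator<? extends T> suffixBackward = suffix.listIterator(suffix.size());
            while (suffixBackward.hasPrevious()) {
              T expected = suffixBackward.previous();
              if (!pipelineBackward.hasNext() || pipelineBackward.next() != expected) {
                throw new IllegalStateException("suffix segment not found in pipeline; it was "
                    + "likely replaced by a concurrent in-memory flattening");
              }
            }
          }
        }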

      I made separate PRs for master and branch-2 so that the code for master and branch-2 stays consistent and master also gets UTs for this problem.
      PR#3777 is for master and PR#3779 is for branch-2. The difference between them is that the branch-2 patch includes the following code in CompactionPipeline.replaceAtIndex, which the branch-2 patch for HBASE-18375 did not include:

          // the version increment is indeed needed, because the swap uses removeAll() method of the
          // linked-list that compares the objects to find what to remove.
          // The flattening changes the segment object completely (creation pattern) and so
          // swap will not proceed correctly after concurrent flattening.
          version++;
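
      For reference, this increment sits in CompactionPipeline.replaceAtIndex, which flattenOneSegment uses to swap the flattened segment into the pipeline. A sketch modeled on the master code (the readOnlyCopy bookkeeping follows master and may differ elsewhere):

        private void replaceAtIndex(int idx, ImmutableSegment newSegment) {
          pipeline.set(idx, newSegment);
          readOnlyCopy = new LinkedList<>(pipeline);
          // Invalidate any VersionedSegmentsList taken before the flattening, so a
          // concurrent pipeline.swap() in pushPipelineToSnapshot() fails and retries.
          version++;
        }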
      
