Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-26567

Remove IndexType from ChunkCreator

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.0.0-alpha-1, 2.4.8
    • 2.5.0, 3.0.0-alpha-3
    • in-memory-compaction
    • None
    • Reviewed

    Description

      For CellChunkImmutableSegment, cells in data chunks is indexed in index chunks by chunk id ,index chunks use chunk id to locate the data chunk by look up chunk id in ChunkCreator.chunkIdMap . Whether or not the ChunkCreator put the data chunk in ChunkCreator.chunkIdMap depends on the data chunk is pooled or the index type is IndexType.CHUNK_MAP. So when we call ChunkCreator.getChunk, it is very important to specify the IndexType. ChunkCreator has following package-visible method, which does not require the IndexType parameter and use IndexType.ARRAY_MAP.

        Chunk getChunk(ChunkType chunkType) {
          return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType);
        }
      

      This method is used by following line 363 in MemStoreLABImpl.getNewExternalChunk now, which is used by CellChunkImmutableSegment.copyCellIntoMSLAB which index type is IndexType.CHUNK_MAP to force copy a big cell into MemStoreLAB, which should allocate a new not-pooled chunk.

      359  public Chunk getNewExternalChunk(ChunkCreator.ChunkType chunkType) {
      360    switch (chunkType) {
      361      case INDEX_CHUNK:
      362      case DATA_CHUNK:
      363        Chunk c = this.chunkCreator.getChunk(chunkType);
      364        chunks.add(c.getId());
      365        return c;
      366      case JUMBO_CHUNK: // a jumbo chunk doesn't have a fixed size
      367      default:
      368        return null;
      369    }
      370  }
      

      Even though fortunately this IndexType mismatch does not cause bug now because MemStoreLABImpl.getNewExternalChunk in fact is not invoked by CellChunkImmutableSegment.copyCellIntoMSLAB when the cell is bigger than chunkSize,but I think it is dangerous to not specify the IndexType if we want to refactor or add new functionalities in the future. We would better to specify the IndexType explicitly when we use ChunkCreator.getChunk.

      When thinking this problem more, I think we could remove the IndexType from ChunkCreator completely to simplify the code, because when use MemStoreLABImpl, the IndexType could only be IndexType.CHUNK_MAP ,just as following line 106 in MemStoreLABImpl ctor.

      96  public MemStoreLABImpl(Configuration conf) {
      97    dataChunkSize = conf.getInt(CHUNK_SIZE_KEY, CHUNK_SIZE_DEFAULT);
      98    maxAlloc = conf.getInt(MAX_ALLOC_KEY, MAX_ALLOC_DEFAULT);
      99    this.chunkCreator = ChunkCreator.getInstance();
      100    // if we don't exclude allocations >CHUNK_SIZE, we'd infiniteloop on one!
      101    Preconditions.checkArgument(maxAlloc <= dataChunkSize,
      102        MAX_ALLOC_KEY + " must be less than " + CHUNK_SIZE_KEY);
      103
      104     // if user requested to work with MSLABs (whether on- or off-heap), then the
      105    // immutable segments are going to use CellChunkMap as their index
      106    idxType = CompactingMemStore.IndexType.CHUNK_MAP;
      107  }
      

      And because the get chunk from ChunkCreator is only used by MemStoreLABImpl, and only when we not use MemStoreLABImpl we could use IndexType.ARRAY_MAP, we can remove the IndexType from ChunkCreator completely.

      Attachments

        Issue Links

          Activity

            People

              comnetwork chenglei
              comnetwork chenglei
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: