HBASE-17623

Reuse the bytes array when building the hfile block

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 2.0.0
    • Fix Version/s: 1.4.0, 2.0.0
    • Component/s: HFile
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      There are three improvements.

      1. The onDiskBlockBytesWithHeader should be backed by a byte array that is reused across blocks while building the hfile.
      2. The onDiskBlockBytesWithHeader is copied to a new byte array only when the block needs to be cached.
      3. If no block needs to be cached, the uncompressedBlockBytesWithHeader is never created.
      HFileBlock.java
          private void finishBlock() throws IOException {
            if (blockType == BlockType.DATA) {
              this.dataBlockEncoder.endBlockEncoding(dataBlockEncodingCtx, userDataStream,
                  baosInMemory.getBuffer(), blockType);
              blockType = dataBlockEncodingCtx.getBlockType();
            }
            userDataStream.flush();
            // This does an array copy, so it is safe to cache this byte array when cache-on-write.
            // Header is still the empty, 'dummy' header that is yet to be filled out.
            uncompressedBlockBytesWithHeader = baosInMemory.toByteArray();
            prevOffset = prevOffsetByType[blockType.getId()];
      
            // We need to set state before we can package the block up for cache-on-write. In a way, the
            // block is ready, but not yet encoded or compressed.
            state = State.BLOCK_READY;
            if (blockType == BlockType.DATA || blockType == BlockType.ENCODED_DATA) {
              onDiskBlockBytesWithHeader = dataBlockEncodingCtx.
                  compressAndEncrypt(uncompressedBlockBytesWithHeader);
            } else {
              onDiskBlockBytesWithHeader = defaultBlockEncodingCtx.
                  compressAndEncrypt(uncompressedBlockBytesWithHeader);
            }
            // Calculate how many bytes we need for checksum on the tail of the block.
            int numBytes = (int) ChecksumUtil.numBytes(
                onDiskBlockBytesWithHeader.length,
                fileContext.getBytesPerChecksum());
      
            // Put the header for the on disk bytes; header currently is unfilled-out
            putHeader(onDiskBlockBytesWithHeader, 0,
                onDiskBlockBytesWithHeader.length + numBytes,
                uncompressedBlockBytesWithHeader.length, onDiskBlockBytesWithHeader.length);
            // Set the header for the uncompressed bytes (for cache-on-write) -- IFF different from
            // onDiskBlockBytesWithHeader array.
            if (onDiskBlockBytesWithHeader != uncompressedBlockBytesWithHeader) {
              putHeader(uncompressedBlockBytesWithHeader, 0,
                onDiskBlockBytesWithHeader.length + numBytes,
                uncompressedBlockBytesWithHeader.length, onDiskBlockBytesWithHeader.length);
            }
            if (onDiskChecksum.length != numBytes) {
              onDiskChecksum = new byte[numBytes];
            }
            ChecksumUtil.generateChecksums(
                onDiskBlockBytesWithHeader, 0, onDiskBlockBytesWithHeader.length,
                onDiskChecksum, 0, fileContext.getChecksumType(), fileContext.getBytesPerChecksum());
          }
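
      The reuse idea in the description can be illustrated with a minimal, self-contained sketch. This is not the code from the attached patches: the class ReusableBlockBuffer and its methods (reset, write, backingArray, copyForCache) are hypothetical names invented for this example. It only shows the general pattern of keeping one backing array across blocks and making a copy only when a block has to outlive the next reset, as cache-on-write requires.

```java
/**
 * Hypothetical illustration of the "reuse the byte array" idea.
 * Not part of HBase; all names here are assumptions made for the example.
 */
final class ReusableBlockBuffer {
  // Backing array that survives across blocks; grown only when a block needs more room.
  private byte[] buf = new byte[64];
  // Number of valid bytes of the current block.
  private int size = 0;

  /** Starts a new block. The backing array is kept, so nothing is allocated here. */
  void reset() {
    size = 0;
  }

  /** Appends len bytes from src, growing the backing array only if it is too small. */
  void write(byte[] src, int off, int len) {
    if (size + len > buf.length) {
      byte[] bigger = new byte[Math.max(buf.length * 2, size + len)];
      System.arraycopy(buf, 0, bigger, 0, size);
      buf = bigger;
    }
    System.arraycopy(src, off, buf, size, len);
    size += len;
  }

  /** Returns the shared backing array; only the first size() bytes are valid. */
  byte[] backingArray() {
    return buf;
  }

  int size() {
    return size;
  }

  /**
   * Returns an independent copy of the current block. Intended to be called only
   * when the block will be cached, since the backing array is overwritten by the
   * next block.
   */
  byte[] copyForCache() {
    byte[] copy = new byte[size];
    System.arraycopy(buf, 0, copy, 0, size);
    return copy;
  }

  public static void main(String[] args) {
    ReusableBlockBuffer b = new ReusableBlockBuffer();
    boolean cacheOnWrite = false; // assumption: a caller-supplied flag

    for (String block : new String[] {"first-block", "second"}) {
      b.reset();
      byte[] data = block.getBytes();
      b.write(data, 0, data.length);
      // Write b.backingArray()[0 .. b.size()) to disk here; copy only if caching.
      byte[] forCache = cacheOnWrite ? b.copyForCache() : null;
      System.out.println(b.size() + " bytes, copied=" + (forCache != null));
    }
  }
}
```

      With cacheOnWrite left false, the loop never allocates a per-block array, which corresponds to the third item in the description. A copy made by copyForCache stays valid after later blocks overwrite the shared array.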

        Attachments

        1. HBASE-17623.branch-1.v3.patch
          32 kB
          Chia-Ping Tsai
        2. HBASE-17623.branch-1.v3.patch
          32 kB
          Chia-Ping Tsai
        3. HBASE-17623.branch-1.v3.patch
          32 kB
          Chia-Ping Tsai
        4. HBASE-17623.v3.patch
          22 kB
          Chia-Ping Tsai
        5. HBASE-17623.v3.patch
          22 kB
          Chia-Ping Tsai
        6. GC measurement.xlsx
          16 kB
          Chia-Ping Tsai
        7. HBASE-17623.branch-1.v2.patch
          31 kB
          Chia-Ping Tsai
        8. HBASE-17623.branch-1.v2.patch
          31 kB
          Chia-Ping Tsai
        9. HBASE-17623.v2.patch
          20 kB
          Chia-Ping Tsai
        10. HBASE-17623.branch-1.v1.patch
          32 kB
          Chia-Ping Tsai
        11. HBASE-17623.branch-1.v0.patch
          32 kB
          Chia-Ping Tsai
        12. before(snappy_hfilesize=5.04GB).png
          353 kB
          Chia-Ping Tsai
        13. after(snappy_hfilesize=5.04GB).png
          391 kB
          Chia-Ping Tsai
        14. after(snappy_hfilesize=755MB).png
          191 kB
          Chia-Ping Tsai
        15. before(snappy_hfilesize=755MB).png
          201 kB
          Chia-Ping Tsai
        16. HBASE-17623.v1.patch
          21 kB
          Chia-Ping Tsai
        17. HBASE-17623.v1.patch
          21 kB
          Chia-Ping Tsai
        18. memory allocation measurement.xlsx
          12 kB
          Chia-Ping Tsai
        19. HBASE-17623.v0.patch
          21 kB
          Chia-Ping Tsai

            People

            • Assignee: Chia-Ping Tsai (chia7712)
            • Reporter: Chia-Ping Tsai (chia7712)
            • Votes: 0
            • Watchers: 9