HBASE-26850

Optimize the implementation of LRUCache in LRUDictionary


Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.11
    • Fix Versions: None
    • Component: wal
    • Labels: None

    Description

      This issue is to unify the behavior of the read and write paths.

      During my research on HBASE-26849, I found that there seems to be something wrong with the implementation of LRUDictionary.

      It uses an array (indexToNode) to store the cached entries, and a HashMap (nodeToIndex) to map each entry's contents back to its array index.
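
      In other words (a declaration-level sketch, not the actual class; the real backing store also maintains a doubly linked list of nodes to track LRU order, and initSize is an assumed capacity parameter):

  // Simplified view of the backing store's state: the array maps
  // index -> node, the HashMap maps node contents -> index.
  Node[] indexToNode = new Node[initSize];
  Map<Node, Short> nodeToIndex = new HashMap<>(initSize);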

      On the write path:

      CompressedKvEncoder#write
      ->
      LRUDictionary#findEntry
      ->
      LRUDictionary#addEntry
      

      And on the read path:

      CompressedKvDecoder#readIntoArray
      ->
      LRUDictionary#addEntry
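
      For reference, here is a hedged paraphrase of the decoder side (a simplified reconstruction, not the verbatim HBase source: the stream framing and the readShortIndex helper are assumptions, and java.io plus the HBase Dictionary interface are taken as given). The key point is that, unlike the write path, there is no findEntry-style lookup before addEntry:

  // Sketch of CompressedKvDecoder#readIntoArray (simplified paraphrase).
  private static byte[] readIntoArray(InputStream in, int length, Dictionary dict)
      throws IOException {
    byte status = (byte) in.read();
    if (status == Dictionary.NOT_IN_DICTIONARY) {
      byte[] arr = new byte[length];
      int off = 0;
      while (off < length) { // read the literal bytes fully
        int n = in.read(arr, off, length - off);
        if (n < 0) {
          throw new EOFException();
        }
        off += n;
      }
      dict.addEntry(arr, 0, length); // always adds, even for duplicate contents
      return arr;
    }
    short dictIdx = readShortIndex(status, in); // hypothetical helper, below
    return dict.getEntry(dictIdx);              // reuse the cached entry
  }

  // Hypothetical helper: rebuilds the dictionary index from the status byte
  // and the next stream byte (the exact framing is an assumption).
  private static short readShortIndex(byte status, InputStream in) throws IOException {
    return (short) (((status & 0xff) << 8) | (in.read() & 0xff));
  }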
      

      Then we can look at the logic in findEntry:

        @Override
        public short findEntry(byte[] data, int offset, int length) {
          short ret = backingStore.findIdx(data, offset, length);
          if (ret == NOT_IN_DICTIONARY) {
            addEntry(data, offset, length);
          }
          return ret;
        }
      
          private short findIdx(byte[] array, int offset, int length) {
            Short s;
            final Node comparisonNode = new Node();
            comparisonNode.setContents(array, offset, length);
            if ((s = nodeToIndex.get(comparisonNode)) != null) {
              moveToHead(indexToNode[s]);
              return s;
            } else {
              return -1;
            }
          }
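
      This lookup only works because Node implements value equality over the referenced byte range, so a freshly built comparisonNode with equal contents hashes and compares equal to a stored node. A minimal sketch of that contract (a hypothetical simplification; the real Node also carries prev/next links for the LRU list):

  // Sketch of contents-based equality (simplified; not the real Node).
  static class Node {
    byte[] container;
    int offset;
    int length;

    void setContents(byte[] data, int offset, int length) {
      this.container = data;
      this.offset = offset;
      this.length = length;
    }

    @Override
    public int hashCode() {
      int h = 1;
      for (int i = offset; i < offset + length; i++) {
        h = 31 * h + container[i]; // hash only the referenced range
      }
      return h;
    }

    @Override
    public boolean equals(Object other) {
      if (!(other instanceof Node)) {
        return false;
      }
      Node that = (Node) other;
      if (length != that.length) {
        return false;
      }
      for (int i = 0; i < length; i++) { // compare the referenced ranges
        if (container[offset + i] != that.container[that.offset + i]) {
          return false;
        }
      }
      return true;
    }
  }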
      

      First it tries to find an equal node in the cache; if one exists, it is reused directly.

      The problem is that when the same contents appear more than once, the read and write paths behave inconsistently:
      on the write path only one node is stored and then reused, while on the read path a separate copy is stored in the array for each occurrence,
      yet the map holds only a single mapping from those contents to an array index.
      So, on the read path, evicting the first of the duplicate nodes removes that mapping from the map, and when a later duplicate node is evicted it can no longer be found in the map, which can lead to an NPE.
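
      A minimal, self-contained model of the failure (this is deliberately not the real LRUDictionary: it uses a tiny capacity and simple FIFO slot reuse instead of the real linked-list LRU, and all names are illustrative). It shows how duplicate adds on the read path leave the map one mapping short and blow up at eviction time:

  import java.util.Arrays;
  import java.util.HashMap;
  import java.util.Map;

  public class LruDictionaryBugDemo {

    // Contents-based equality, mirroring the behavior sketched above.
    static final class Node {
      final byte[] contents;
      Node(byte[] contents) { this.contents = contents; }
      @Override public boolean equals(Object o) {
        return o instanceof Node && Arrays.equals(contents, ((Node) o).contents);
      }
      @Override public int hashCode() { return Arrays.hashCode(contents); }
    }

    static final int CAPACITY = 2; // tiny capacity to force eviction quickly
    static final Node[] indexToNode = new Node[CAPACITY];
    static final Map<Node, Short> nodeToIndex = new HashMap<>();
    static int currSize = 0;
    static int evictCursor = 0; // FIFO slot reuse stands in for real LRU order

    // Shaped like the dictionary's put(): fill free slots first, then evict
    // and reuse the oldest slot.
    static short addEntry(byte[] data) {
      if (currSize < CAPACITY) {
        short idx = (short) currSize++;
        indexToNode[idx] = new Node(data);
        // A duplicate add overwrites the previous mapping: two array slots,
        // but only one map entry.
        nodeToIndex.put(indexToNode[idx], idx);
        return idx;
      }
      Node victim = indexToNode[evictCursor];
      // Unboxing NPE here once an earlier eviction of equal contents has
      // already removed the single shared mapping.
      short s = nodeToIndex.remove(victim);
      Node fresh = new Node(data);
      indexToNode[evictCursor] = fresh;
      nodeToIndex.put(fresh, s);
      evictCursor = (evictCursor + 1) % CAPACITY;
      return s;
    }

    public static void main(String[] args) {
      addEntry("A".getBytes()); // slot 0, map {A -> 0}
      addEntry("A".getBytes()); // read path stores a copy: slot 1, map {A -> 1}
      addEntry("B".getBytes()); // evicts slot 0: remove(A) deletes the only A mapping
      addEntry("C".getBytes()); // evicts slot 1 (also A): remove(A) returns null -> NPE
    }
  }

      Note that in this model the third add already corrupts the state even before the NPE: B is stored in slot 0 but mapped to index 1, which is the same read/write inconsistency described above.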

People

    Assignee: tianhang tang
    Reporter: tianhang tang