  1. HBase
  2. HBASE-17755

CellBasedKeyBlockIndexReader#midkey should exhaust search of the target middle key on skewed regions


Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • HFile
    • None

    Description

      We have always returned the middle key of the block index regardless of how the data is distributed in an HFile. A side effect of that approach is that when millions of cells share the same row key, it is quite easy to end up with a middle key equal to the start key or to the end key. That makes the HFile nearly impossible to split until enough data is written into the region for the middle key to shift to another row, or until an operator splits the region with a custom split point.

      Instead, we should exhaust the search for a usable middle key in the block index so that an HFile can be split earlier whenever possible, even if our edge case is serving a region that could hold a single key with millions of versions of a row or millions of qualifiers on the same row.
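
      The proposal could be sketched roughly as follows. This is not the actual CellBasedKeyBlockIndexReader#midkey code. The class MidKeySearchSketch, the method findSplittableMidRow, and the use of plain String rows in place of block index entries are all assumptions made for illustration. The idea is to start at the middle index entry and walk outward until an entry is found whose row differs from both the first and the last row of the file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only: names and types are hypothetical, not HBase APIs.
class MidKeySearchSketch {

    /**
     * blockFirstRows stands in for the block index: the row of the first cell
     * of each block, in sorted order. firstRow and lastRow are the rows of the
     * first and last cells in the file.
     *
     * Returns the entry closest to the middle whose row is neither firstRow
     * nor lastRow, or empty if no such entry exists.
     */
    static Optional<String> findSplittableMidRow(List<String> blockFirstRows,
                                                 String firstRow,
                                                 String lastRow) {
        int n = blockFirstRows.size();
        if (n == 0) {
            return Optional.empty();
        }
        int mid = (n - 1) / 2;
        // Walk outward from the middle: mid, mid+1, mid-1, mid+2, mid-2, ...
        for (int offset = 0; offset < n; offset++) {
            int right = mid + offset;
            if (right < n && isUsable(blockFirstRows.get(right), firstRow, lastRow)) {
                return Optional.of(blockFirstRows.get(right));
            }
            int left = mid - offset;
            if (offset > 0 && left >= 0
                    && isUsable(blockFirstRows.get(left), firstRow, lastRow)) {
                return Optional.of(blockFirstRows.get(left));
            }
        }
        // Every index entry sits on the first or last row: nothing to split on.
        return Optional.empty();
    }

    // A candidate is usable only if it differs from both boundary rows.
    private static boolean isUsable(String row, String firstRow, String lastRow) {
        return !row.equals(firstRow) && !row.equals(lastRow);
    }

    public static void main(String[] args) {
        // Skewed file: most blocks start on row "a", so the plain middle entry is "a".
        List<String> skewed = Arrays.asList("a", "a", "a", "a", "a", "b", "c");
        System.out.println(findSplittableMidRow(skewed, "a", "c"));
    }
}
```

      Returning an empty result when every entry matches a boundary row keeps the single-row edge case explicit: such a file still cannot be split on a row boundary, and the caller would fall back to the current behavior.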

          People

            esteban Esteban Gutierrez
            esteban Esteban Gutierrez
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated: