HBase / HBASE-26878

TableInputFormatBase should cache RegionSizeCalculator

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.5.0, 3.0.0-alpha-3, 2.4.12
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      TableInputFormatBase's getSplits() method instantiates a new RegionSizeCalculator on every call. Instantiating a RegionSizeCalculator involves scanning meta for all region locations of the given table. This can be costly for large tables, and we don't know how often a subclass will call getSplits().

      When initializeTable is called, we already cache the RegionLocator and Admin that are passed into the RegionSizeCalculator. We should similarly cache the RegionSizeCalculator itself at that same time to avoid unnecessary meta scans on repeated getSplits() calls.
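
      The proposed caching can be illustrated with a minimal, self-contained sketch. It is not the actual patch and does not use the HBase API: SplitCachingSketch, FakeRegionSizeCalculator, FakeTableInputFormat, the region names, and the size placeholder are all hypothetical stand-ins. The only assumption carried over from the description is that constructing the calculator triggers one costly meta scan, so building it once in initializeTable and reusing it in getSplits() avoids repeated scans.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real HBase classes.
class SplitCachingSketch {

  /** Stand-in for RegionSizeCalculator: construction simulates the meta scan. */
  static class FakeRegionSizeCalculator {
    static int metaScans = 0; // counts simulated meta scans across all instances

    FakeRegionSizeCalculator(String table) {
      metaScans++; // assumption: the real constructor looks up region locations here
    }

    long getRegionSize(String region) {
      return region.length(); // placeholder value, not a real region size
    }
  }

  /** Stand-in for TableInputFormatBase with the calculator cached at init time. */
  static class FakeTableInputFormat {
    private String table;
    private FakeRegionSizeCalculator sizeCalculator; // cached, like RegionLocator and Admin

    void initializeTable(String table) {
      this.table = table;
      // Build the calculator once, when the other table handles are cached.
      this.sizeCalculator = new FakeRegionSizeCalculator(table);
    }

    List<String> getSplits() {
      if (table == null) {
        throw new IllegalStateException("initializeTable must be called first");
      }
      List<String> splits = new ArrayList<>();
      // Hypothetical fixed region list; the cached calculator is reused on every call.
      for (String region : new String[] {"region-a", "region-b"}) {
        splits.add(table + "/" + region + ":" + sizeCalculator.getRegionSize(region));
      }
      return splits;
    }
  }

  public static void main(String[] args) {
    FakeTableInputFormat format = new FakeTableInputFormat();
    format.initializeTable("demo_table");
    format.getSplits();
    format.getSplits();
    format.getSplits();
    System.out.println("simulated meta scans: " + FakeRegionSizeCalculator.metaScans);
  }
}
```

      In this sketch, calling initializeTable again (for example, to switch tables) rebuilds the calculator, so the cached instance never outlives the table it was created for. Whether the real change builds the calculator eagerly or lazily is not stated in this ticket.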

            People

              Assignee: Bryan Beaudreault (bbeaudreault)
              Reporter: Bryan Beaudreault (bbeaudreault)