Lucene - Core / LUCENE-10332

Speed up Facets by enabling batch reading of LongValues

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Do
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/codecs
    • Labels: None
    • Lucene Fields: New

    Description

      In Lucene90DocValuesProducer, several places read LongValues using this pattern:

      long startOffset = addresses.get(doc);
      bytes.length = (int) (addresses.get(doc + 1L) - startOffset);
      

      In these cases, we need to read two numbers that are stored next to each other. It would be great if we could read both longs in a single call. The luceneutil benchmark shows that some Facets tasks sped up by more than 20% with this approach (about 25% to 27% for the three Browse*TaxoFacets tasks):

      Benchmark

                              Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 BrowseMonthSSDVFacets       17.25      (8.6%)       16.78     (17.8%)   -2.7% ( -26% -   25%) 0.536
                               LowTerm     1458.66      (3.6%)     1438.15      (4.4%)   -1.4% (  -9% -    6%) 0.268
                 HighTermDayOfYearSort      108.55     (10.0%)      108.04      (9.1%)   -0.5% ( -17% -   20%) 0.874
                            HighPhrase      168.65      (1.9%)      168.06      (2.3%)   -0.3% (  -4% -    3%) 0.602
                          OrNotHighLow     1201.79      (3.4%)     1197.93      (4.6%)   -0.3% (  -8% -    7%) 0.801
                          HighSpanNear       15.26      (1.6%)       15.21      (1.4%)   -0.3% (  -3% -    2%) 0.499
                               Respell       62.61      (1.8%)       62.45      (1.9%)   -0.3% (  -3% -    3%) 0.649
                             MedPhrase       57.57      (1.4%)       57.44      (1.8%)   -0.2% (  -3% -    2%) 0.648
                             OrHighMed      129.10      (3.0%)      128.83      (3.1%)   -0.2% (  -6% -    6%) 0.830
                           MedSpanNear       19.45      (2.3%)       19.41      (2.2%)   -0.2% (  -4% -    4%) 0.784
                            OrHighHigh       34.85      (1.5%)       34.79      (1.4%)   -0.2% (  -3% -    2%) 0.722
                  HighIntervalsOrdered       26.92      (4.7%)       26.89      (4.9%)   -0.1% (  -9% -    9%) 0.929
                                IntNRQ      343.52      (1.6%)      343.16      (2.0%)   -0.1% (  -3% -    3%) 0.855
                         OrHighNotHigh      595.61      (3.2%)      595.10      (4.3%)   -0.1% (  -7% -    7%) 0.944
                   MedIntervalsOrdered       17.66      (3.6%)       17.65      (3.8%)   -0.1% (  -7% -    7%) 0.961
                   LowIntervalsOrdered      109.23      (3.3%)      109.18      (3.5%)   -0.0% (  -6% -    7%) 0.969
                           AndHighHigh       81.09      (1.5%)       81.10      (2.0%)    0.0% (  -3% -    3%) 0.967
                           LowSpanNear      203.33      (2.1%)      203.41      (1.8%)    0.0% (  -3% -    3%) 0.948
                       MedSloppyPhrase       27.15      (1.5%)       27.17      (1.2%)    0.1% (  -2% -    2%) 0.907
                             LowPhrase       75.76      (1.8%)       75.81      (2.0%)    0.1% (  -3% -    3%) 0.904
               AndHighMedDayTaxoFacets       97.27      (1.9%)       97.35      (1.9%)    0.1% (  -3% -    4%) 0.888
                      HighSloppyPhrase       14.32      (2.7%)       14.34      (1.8%)    0.1% (  -4% -    4%) 0.870
                                Fuzzy2       76.00      (3.9%)       76.12      (3.4%)    0.2% (  -6% -    7%) 0.894
                              Wildcard      123.51      (1.8%)      123.71      (2.1%)    0.2% (  -3% -    4%) 0.796
                          OrHighNotLow      722.64      (4.4%)      724.15      (5.4%)    0.2% (  -9% -   10%) 0.894
                            AndHighLow      929.73      (4.0%)      931.75      (3.8%)    0.2% (  -7% -    8%) 0.859
                               Prefix3      240.13      (1.5%)      240.69      (1.9%)    0.2% (  -3% -    3%) 0.675
                            AndHighMed      210.17      (1.7%)      210.84      (1.6%)    0.3% (  -2% -    3%) 0.532
                       LowSloppyPhrase      142.83      (1.8%)      143.54      (2.0%)    0.5% (  -3% -    4%) 0.410
                          OrNotHighMed      709.24      (4.4%)      712.78      (4.3%)    0.5% (  -7% -    9%) 0.715
                                Fuzzy1       85.33      (5.7%)       85.77      (6.3%)    0.5% ( -10% -   13%) 0.786
                               MedTerm     1466.50      (3.5%)     1474.85      (3.9%)    0.6% (  -6% -    8%) 0.629
                            TermDTSort      105.51      (7.7%)      106.33      (7.3%)    0.8% ( -13% -   17%) 0.746
                              PKLookup      206.18      (2.9%)      208.68      (2.9%)    1.2% (  -4% -    7%) 0.179
                          OrHighNotMed      876.71      (3.0%)      887.84      (3.9%)    1.3% (  -5% -    8%) 0.251
                         OrNotHighHigh      774.25      (4.7%)      785.03      (6.0%)    1.4% (  -8% -   12%) 0.411
                     HighTermMonthSort       74.33      (9.4%)       75.47     (16.3%)    1.5% ( -22% -   30%) 0.716
                             OrHighLow      518.73      (5.2%)      528.27      (5.4%)    1.8% (  -8% -   13%) 0.272
                              HighTerm     1892.16      (3.4%)     1934.63      (5.5%)    2.2% (  -6% -   11%) 0.120
              AndHighHighDayTaxoFacets       16.46      (2.7%)       16.84      (2.3%)    2.3% (  -2% -    7%) 0.004
                  HighTermTitleBDVSort      141.39     (14.6%)      145.33     (15.1%)    2.8% ( -23% -   38%) 0.554
                  MedTermDayTaxoFacets       27.81      (2.1%)       29.54      (2.3%)    6.2% (   1% -   10%) 0.000
                OrHighMedDayTaxoFacets        3.05      (1.9%)        3.30      (2.2%)    8.3% (   4% -   12%) 0.000
             BrowseDayOfYearSSDVFacets       17.36     (13.0%)       18.97     (15.8%)    9.3% ( -17% -   43%) 0.042
             BrowseDayOfYearTaxoFacets        3.02      (3.6%)        3.79      (2.5%)   25.4% (  18% -   32%) 0.000
                  BrowseDateTaxoFacets        3.01      (3.6%)        3.79      (2.5%)   25.6% (  18% -   32%) 0.000
                 BrowseMonthTaxoFacets        3.14      (2.1%)        3.99      (2.5%)   27.0% (  21% -   32%) 0.000
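
The idea of reading two adjacent longs in one call can be sketched as follows. This is only an illustration, not the patch benchmarked above and not Lucene's API: `PairLongValues`, `getPair`, and `BlockAddresses` are hypothetical names, and the block layout is a toy encoding chosen to show where a paired read can share work (one block lookup instead of two).

```java
/** Hypothetical interface; an assumption for illustration, not Lucene's LongValues. */
interface PairLongValues {
    long get(long index);

    /** Reads the values at index and index + 1 into dst[0] and dst[1]. */
    default void getPair(long index, long[] dst) {
        // Fallback: two independent single reads, as in the original pattern.
        dst[0] = get(index);
        dst[1] = get(index + 1);
    }
}

/** Toy block encoding: each block stores a base value plus a small delta per entry. */
final class BlockAddresses implements PairLongValues {
    private static final int BLOCK_SHIFT = 4;                  // 16 values per block (arbitrary)
    private static final int BLOCK_MASK = (1 << BLOCK_SHIFT) - 1;

    private final long[] bases;  // first value of each block
    private final int[] deltas;  // value minus the base of its block

    BlockAddresses(long[] values) {
        int numBlocks = (values.length + BLOCK_MASK) >>> BLOCK_SHIFT;
        bases = new long[numBlocks];
        deltas = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int block = i >>> BLOCK_SHIFT;
            if ((i & BLOCK_MASK) == 0) {
                bases[block] = values[i];
            }
            deltas[i] = Math.toIntExact(values[i] - bases[block]);
        }
    }

    @Override
    public long get(long index) {
        int i = Math.toIntExact(index);
        return bases[i >>> BLOCK_SHIFT] + deltas[i];
    }

    @Override
    public void getPair(long index, long[] dst) {
        int i = Math.toIntExact(index);
        int block = i >>> BLOCK_SHIFT;
        long base = bases[block];
        dst[0] = base + deltas[i];
        // Reuse the already loaded base when the next entry is in the same block;
        // otherwise fall back to a normal single read.
        dst[1] = ((i + 1) >>> BLOCK_SHIFT) == block ? base + deltas[i + 1] : get(index + 1);
    }
}
```

With such a method, the two-call pattern from the description would become a single `addresses.getPair(doc, pair)` followed by `startOffset = pair[0]` and `length = (int) (pair[1] - pair[0])`. Whether this helps in practice depends on the real encoding and on how the JIT treats the extra array write.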
      

            People

              Assignee: Unassigned
              Reporter: Feng Guo (gf2121)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m