Details
- Improvement
- Status: Patch Available
- Major
- Resolution: Unresolved
- 9.0, 9.1, 9.2
- None
- None
- New, Patch Available
Description
Problem Statement
We noticed a DocValues read performance regression with the iterative API when upgrading from Lucene 5 to Lucene 9. Our latency increased by 50%. The degradation is similar to what is described in https://issues.apache.org/jira/browse/SOLR-9599
By analyzing profiling data, we found that the methods "advanceWithinBlock" and "advanceExactWithinBlock" for Sparse IndexedDISI are slow in Lucene 9 because they use an O(N) doc lookup algorithm.
Changes
Replaced the current O(N) lookup in Sparse IndexedDISI "advanceWithinBlock" and "advanceExactWithinBlock" with a binary search. This is possible because the docs within a block are stored in ascending order.
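To illustrate the idea, here is a minimal sketch, not the attached patch. It assumes a sparse block can be modeled as a sorted int[] of doc IDs; the real IndexedDISI reads doc IDs from the index input rather than from an array. The class SparseBlockSearch and its method names are hypothetical and exist only for this example.

```java
/** Hypothetical stand-in for one sparse block: doc IDs held in a sorted int[]. */
public final class SparseBlockSearch {
  private SparseBlockSearch() {}

  /** O(N) scan: index of the first doc >= target at or after 'from', or docs.length if none. */
  public static int advanceLinear(int[] docs, int from, int target) {
    for (int i = from; i < docs.length; i++) {
      if (docs[i] >= target) {
        return i;
      }
    }
    return docs.length;
  }

  /** O(log N) lower-bound binary search over [from, docs.length); same contract as advanceLinear. */
  public static int advanceBinary(int[] docs, int from, int target) {
    int lo = from;
    int hi = docs.length; // exclusive upper bound
    while (lo < hi) {
      int mid = (lo + hi) >>> 1; // unsigned shift avoids overflow on large indices
      if (docs[mid] < target) {
        lo = mid + 1; // everything up to mid is too small
      } else {
        hi = mid; // mid is a candidate; keep searching to its left
      }
    }
    return lo;
  }

  /** True if 'target' is present in the block at or after 'from'. */
  public static boolean advanceExactBinary(int[] docs, int from, int target) {
    int idx = advanceBinary(docs, from, target);
    return idx < docs.length && docs[idx] == target;
  }

  public static void main(String[] args) {
    int[] docs = {3, 17, 42, 100, 4095, 65535};
    System.out.println(advanceBinary(docs, 0, 18));      // 2 (doc 42)
    System.out.println(advanceExactBinary(docs, 0, 42)); // true
    System.out.println(advanceExactBinary(docs, 0, 43)); // false
  }
}
```

Both variants return the same position. The binary search only pays off when the target is far from the current position; for short forward steps a linear scan can be just as fast, which may be one reason the rerun benchmark showed little difference.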
Test
./gradlew tidy
./gradlew check
Benchmark
06/30/2022 Update: The benchmark data points below are invalid. I started a new AWS EC2 instance and reran the test; the performance of the candidate and the baseline is very close.
Ran the sparseTaxis test cases from luceneutil. The baseline and candidate reports are attached in the Attachments section.
1. Most cases have 5-10% search latency reduction.
2. Some highlights (>20%):
T0 green_pickup_latitude:[40.75 TO 40.9] yellow_pickup_latitude:[40.75 TO 40.9] sort=null
Baseline: 10973978+ hits hits in 726.81967 msec
Candidate: 10973978+ hits hits in 484.544594 msec
T0 cab_color:y cab_color:g sort=null
Baseline: 2300174+ hits hits in 95.698324 msec
Candidate: 2300174+ hits hits in 78.336193 msec
T1 cab_color:y cab_color:g sort=null
Baseline: 2300174+ hits hits in 391.565239 msec
Candidate: 300174+ hits hits in 227.592885 msec
...
Attachments