[LUCENE-5343] Potential floating point precision error in ConjuncionScorer.score() - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.6.1, 3.6.3, 4.5, 4.5.1
Fix Version/s: None
Component/s: core/query/scoring
Labels:
None

Lucene Fields:

New

Description

I have been investigating an issue with document scoring and found that the ConjunctionScorer implements the score method in a way that can cause floating point precision rounding issues. I noticed in some of my test cases that documents that have not been merged/optimized (I'm not sure of the correct terminology, they have a docNum of 0) have scorers added in a different order than optimized documents. Using a float to maintain the sum of scores introduces the potential for floating point precision errors. In turn this causes the score that is returned from the ConjunctionScorer to be different for some merged/unmerged documents that should have identical scores.

Example:

float sum1 = 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 0.0030929677f + 0.5010608f + 0.0061859353f;

float sum2 = 0.0061859353f + 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 0.0030929677f + 0.5010608f;

sum1 == 0.5288975; // Incorrect
sum2 == 0.52889746; // Correct

I also noticed that there is a comment in the 4.5.1 version of Lucene to the effect of:
// TODO: sum into a double and cast to float if we ever send required clauses to BS1

Is there a reason that this has not been implemented yet?

public float score() throws IOException {
double sum = 0.0d;
for (int i = 0; i < scorers.length; i++)

{ sum += scorers[i].score(); }

return (float)sum;
}

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Jonathan Hoag

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 17/Nov/13 20:11

Updated:: 28/Aug/22 13:57