Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
3.6.1, 3.6.3, 4.5, 4.5.1
-
None
-
None
-
New
Description
I have been investigating an issue with document scoring and found that the ConjunctionScorer implements the score method in a way that can cause floating point precision rounding issues. I noticed in some of my test cases that documents that have not been merged/optimized (I'm not sure of the correct terminology, they have a docNum of 0) have scorers added in a different order than optimized documents. Using a float to maintain the sum of scores introduces the potential for floating point precision errors. In turn this causes the score that is returned from the ConjunctionScorer to be different for some merged/unmerged documents that should have identical scores.
Example:
float sum1 = 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 0.0030929677f + 0.5010608f + 0.0061859353f;
float sum2 = 0.0061859353f + 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 0.0030929677f + 0.5010608f;
sum1 == 0.5288975; // Incorrect
sum2 == 0.52889746; // Correct
I also noticed that there is a comment in the 4.5.1 version of Lucene to the effect of:
// TODO: sum into a double and cast to float if we ever send required clauses to BS1
Is there a reason that this has not been implemented yet?
public float score() throws IOException {
double sum = 0.0d;
for (int i = 0; i < scorers.length; i++)
return (float)sum;
}