Details
Type: Bug
Status: Resolved
Priority: P1
Resolution: Fixed
Versions: 0.2.0-incubating, 2.22.0
Description
When the PCollection passed into Min or Max contains a NaN value, we get an effectively random value back: because of the way the CombineFn compares values, the result depends on where the NaN falls in the input. Per the SQL standard, we should always get NaN back. I'm going to add a special case to get the right answer.
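Looks like we switched from using Double.compare to the `>=` operator in https://github.com/apache/beam/commit/21a5b44c3b541ba6c89df5649afe00412df73d10, which introduced a data corruption bug. Every `>=` comparison involving NaN evaluates to false, whereas Double.compare treats NaN as greater than every other value, including positive infinity.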
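To make the order dependence concrete, here is a minimal sketch. It assumes a max accumulator that keeps the current value only when `acc >= x`. The class `NaNOrderDemo` and the method `maxWithGreaterOrEqual` are hypothetical stand-ins written for illustration, not the actual Beam CombineFn code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Beam implementation.
final class NaNOrderDemo {

    // Folds the values with a ">="-based max, starting from negative infinity.
    static double maxWithGreaterOrEqual(List<Double> values) {
        double acc = Double.NEGATIVE_INFINITY;
        for (double x : values) {
            // Any ">=" comparison involving NaN is false, so a NaN input always
            // replaces the accumulator, and a NaN accumulator is always replaced
            // by the next input.
            acc = (acc >= x) ? acc : x;
        }
        return acc;
    }

    public static void main(String[] args) {
        // Same three elements, three different orders, three different answers.
        System.out.println(maxWithGreaterOrEqual(Arrays.asList(Double.NaN, 1.0, 2.0))); // 2.0
        System.out.println(maxWithGreaterOrEqual(Arrays.asList(1.0, 2.0, Double.NaN))); // NaN
        System.out.println(maxWithGreaterOrEqual(Arrays.asList(2.0, Double.NaN, 1.0))); // 1.0
    }
}
```

In the last ordering the result is 1.0, which is not even the largest non-NaN input, since the NaN in the middle discards the 2.0 that had been accumulated.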
A test case demonstrating this issue:
@Test
public void testDouble() {
  Assert.assertFalse(Double.NaN >= 0.9);
  Assert.assertFalse(0.9 >= Double.NaN);
  Assert.assertFalse(Double.NaN >= Double.POSITIVE_INFINITY);
  Assert.assertFalse(Double.POSITIVE_INFINITY >= Double.NaN);
  Assert.assertTrue(Double.compare(Double.NaN, 0.9) >= 0);
  Assert.assertFalse(Double.compare(0.9, Double.NaN) >= 0);
  Assert.assertTrue(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) >= 0);
  Assert.assertFalse(Double.compare(Double.POSITIVE_INFINITY, Double.NaN) >= 0);
}
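One possible shape for the special case mentioned in the description is sketched below: check for NaN explicitly before comparing, so that NaN propagates for both Min and Max regardless of input order. The class `NaNAwareExtrema` and its methods are hypothetical names, and the choice of infinities as the starting accumulators for empty input is an assumption of this sketch. This is not necessarily how the actual fix was written.

```java
// Hypothetical sketch of a NaN-propagating min/max; not the actual Beam fix.
final class NaNAwareExtrema {

    // Returns NaN if either argument is NaN, otherwise the larger value.
    static double max(double a, double b) {
        if (Double.isNaN(a) || Double.isNaN(b)) {
            return Double.NaN;
        }
        return a >= b ? a : b;
    }

    // Returns NaN if either argument is NaN, otherwise the smaller value.
    static double min(double a, double b) {
        if (Double.isNaN(a) || Double.isNaN(b)) {
            return Double.NaN;
        }
        return a <= b ? a : b;
    }

    // Folds all values with max; an empty input yields negative infinity
    // (an assumption made for this sketch).
    static double maxOf(double... values) {
        double acc = Double.NEGATIVE_INFINITY;
        for (double x : values) {
            acc = max(acc, x);
        }
        return acc;
    }

    // Folds all values with min; an empty input yields positive infinity
    // (an assumption made for this sketch).
    static double minOf(double... values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double x : values) {
            acc = min(acc, x);
        }
        return acc;
    }

    public static void main(String[] args) {
        // NaN wins in every position, for both min and max.
        System.out.println(maxOf(Double.NaN, 1.0, 2.0)); // NaN
        System.out.println(maxOf(2.0, Double.NaN, 1.0)); // NaN
        System.out.println(minOf(1.0, 2.0, Double.NaN)); // NaN
        System.out.println(maxOf(1.0, 2.0));             // 2.0
    }
}
```

Double.compare alone would not be enough for Min: because it orders NaN above every other value, a compare-based minimum would skip NaN rather than return it. The JDK's `Math.max` and `Math.min` already return NaN when either argument is NaN, so they could serve the same purpose as the explicit checks here.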