Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.5.0
Description
Readable query to showcase the problem:
SELECT t1.id, v3.id, v3.int_col, v3.bigint_col FROM functional.alltypestiny t1 LEFT OUTER JOIN (SELECT id, int_col, COALESCE(bigint_col, 4000) AS bigint_col FROM (SELECT id, COALESCE(int_col, 3000) AS int_col, bigint_col FROM (SELECT COALESCE(id + 100, 2000) AS id, int_col, bigint_col FROM functional.alltypestiny t2 ) v1 ) v2 ) v3 ON t1.id = v3.id
The query above used to not return NULLs on the nullable side of the outer join although no rows from t1 match with those from v3 based on the join condition.
Original query produced by the query generator:
WITH with_1 AS (SELECT COALESCE(a2.int_col, a2.int_col, a2.int_col) AS int_col FROM ( SELECT IF(LAG(True, 21) OVER (ORDER BY a1.double_col ASC, a1.string_col), LAG(-680) OVER (ORDER BY a1.string_col ASC, a1.double_col), -793) AS int_col FROM alltypestiny a1 ) a2) SELECT a4.int_col, a3.month FROM alltypes a3 LEFT JOIN with_1 a4 ON ((a4.int_col) = (a3.month))
First few lines of the result:
+---------+-------+ | int_col | month | +---------+-------+ | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | | -793 | 8 | +---------+-------+
Since we are joining on ((a4.int_col) = (a3.month)), both columns in the result set should be the same (or the right column can be NULL).