Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
A common SQL query such as
select category as category, count(distinct maskdid) as uv from dwd_internal_inc_d group by category
can return wrong results on trunk: the values in the category column can be mixed up, and
the count(distinct maskdid) aggregate is also wrong.
After some debugging, we found that the problem is caused by an incorrect byteStarts[i] being used when copying the current keys to the reusable keys:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/wrapper/VectorHashKeyWrapperGeneral.java#L351-L362
byteStarts[i] is always 0 because of Arrays.fill(byteStarts, 0). As a result, when the condition clone.byteValues[i].length >= byteValues[i].length is met, the copy of the current key into the reusable key starts at offset 0 rather than at the key's real start index, so the wrong bytes are copied. This is what produces the wrong results.
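To illustrate the failure mode, here is a minimal standalone sketch. It is not the Hive code: the class KeyCopySketch, the method copyKey, and the sample data are hypothetical. It assumes two string keys stored back to back in one shared byte buffer, so a key is identified by (buffer, start, len). Copying with a zeroed start picks up the first key's bytes even when the current key is the second one.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; names and data are not taken from Hive.
final class KeyCopySketch {

    // Copies len bytes of the current key into the reusable buffer,
    // beginning at the given start offset of the source buffer.
    static String copyKey(byte[] source, int start, int len, byte[] reusable) {
        System.arraycopy(source, start, reusable, 0, len);
        return new String(reusable, 0, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Two keys share one buffer: "books" at offset 0, "music" at offset 5.
        byte[] batch = "booksmusic".getBytes(StandardCharsets.US_ASCII);
        byte[] reusable = new byte[16];

        int realStart = 5; // the current key is "music"
        int len = 5;

        // Start offset reset to 0: the wrong key is copied.
        System.out.println(copyKey(batch, 0, len, reusable));         // prints "books"
        // Real start offset: the intended key is copied.
        System.out.println(copyKey(batch, realStart, len, reusable)); // prints "music"
    }
}
```

In a group-by, a wrongly copied key like this would put rows into the wrong group, which is consistent with both the mixed-up category values and the wrong distinct counts described above.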
Issue Links
- fixes: HIVE-24139 VectorGroupByOperator is not flushing hash table entries as needed (Resolved)