Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
VectorExtractRow relies on vector[i] and length[i] being consistent within the BytesColumnVector; otherwise it throws an exception:
https://github.com/apache/hive/blob/edc53cc0d95e983c371a224943dd866210f0c65c/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExtractRow.java#L275
There is a scenario in which only vector[i] has been cleared while the column vector is being reused, and then this kind of exception can be thrown.
The reproduction was made with LlapDump on String columns (longer than 16 chars):
19/10/17 15:55:49 ERROR llap.LlapArrowRowRecordReader: Failed to fetch Arrow batch
java.lang.RuntimeException: STRING entry: batchIndex 45
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.BytesReadError(VectorExtractRow.java:488)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:294)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:193)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:483)
    at org.apache.hadoop.hive.ql.io.arrow.Deserializer.deserialize(Deserializer.java:125)
    at org.apache.hadoop.hive.ql.io.arrow.ArrowColumnarBatchSerDe.deserialize(ArrowColumnarBatchSerDe.java:284)
    at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:75)
    at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:41)
    at datareader.LlapDump.main(LlapDump.java:124)
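
To illustrate the inconsistency described above, here is a minimal sketch that does not use Hive classes. SimpleBytesColumn, resetVectorOnly, resetAll and extract are hypothetical names, and the check in extract is only an assumption about the kind of condition VectorExtractRow enforces; the real code may differ in detail. It shows how clearing vector[i] while leaving length[i] stale can trigger the error, and how clearing both keeps the entry consistent.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesColumnConsistencySketch {

    /** Hypothetical, simplified stand-in for a bytes column vector (not Hive's class). */
    public static final class SimpleBytesColumn {
        final byte[][] vector;
        final int[] start;
        final int[] length;

        public SimpleBytesColumn(int size) {
            vector = new byte[size][];
            start = new int[size];
            length = new int[size];
        }

        /** Stores a value by reference and records its start offset and length. */
        public void set(int i, String value) {
            byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
            vector[i] = bytes;
            start[i] = 0;
            length[i] = bytes.length;
        }

        /** Problematic reuse: clears only the byte references, leaving stale lengths. */
        public void resetVectorOnly() {
            Arrays.fill(vector, null);
        }

        /** Consistent reuse: clears references, start offsets and lengths together. */
        public void resetAll() {
            Arrays.fill(vector, null);
            Arrays.fill(start, 0);
            Arrays.fill(length, 0);
        }
    }

    /**
     * Assumed consistency check: a null byte array combined with a non-zero
     * length is treated as a read error. A null array with zero length is
     * returned as an empty string in this sketch.
     */
    public static String extract(SimpleBytesColumn col, int batchIndex) {
        byte[] bytes = col.vector[batchIndex];
        int len = col.length[batchIndex];
        if (bytes == null) {
            if (len != 0) {
                throw new RuntimeException("STRING entry: batchIndex " + batchIndex);
            }
            return "";
        }
        return new String(bytes, col.start[batchIndex], len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SimpleBytesColumn col = new SimpleBytesColumn(1);
        col.set(0, "a string longer than 16 chars");
        System.out.println(extract(col, 0));

        col.resetVectorOnly(); // vector[0] is null, length[0] is still stale
        try {
            extract(col, 0);
        } catch (RuntimeException e) {
            System.out.println("inconsistent entry: " + e.getMessage());
        }

        col.resetAll(); // vector[0], start[0] and length[0] agree again
        System.out.println("after full reset: '" + extract(col, 0) + "'");
    }
}
```

In this sketch the failure disappears once the reset path clears length[i] (and start[i]) together with vector[i], which is one plausible direction for a fix under the assumptions stated above.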