Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Impala 3.4.0
Description
When reading a Parquet file in Impala 3.4, the following errors were encountered:
I0714 16:11:48.307806 1075820 runtime-state.cc:207] 8c43203adb2d4fc8:0478df9b0000018b] Error from query 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
I0714 16:11:48.834901 1075838 status.cc:126] 8c43203adb2d4fc8:0478df9b000002c0] Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
    @ 0xbf4ef9
    @ 0x1748c41
    @ 0x174e170
    @ 0x1750e58
    @ 0x17519f0
    @ 0x1748559
    @ 0x1510b41
    @ 0x1512c8f
    @ 0x137488a
    @ 0x1375759
    @ 0x1b48a19
    @ 0x7f34509f5e24
    @ 0x7f344d5ed35c
I0714 16:11:48.835763 1075838 runtime-state.cc:207] 8c43203adb2d4fc8:0478df9b000002c0] Error from query 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
I0714 16:11:48.893784 1075820 status.cc:126] 8c43203adb2d4fc8:0478df9b0000018b] Top level rows aren't in sync during page filtering in file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
    @ 0xbf4ef9
    @ 0x1749104
    @ 0x17494cc
    @ 0x1751aee
    @ 0x1748559
    @ 0x1510b41
    @ 0x1512c8f
    @ 0x137488a
    @ 0x1375759
    @ 0x1b48a19
    @ 0x7f34509f5e24
    @ 0x7f344d5ed35c
Corresponding source code:
Status HdfsParquetScanner::CheckPageFiltering() {
  if (candidate_ranges_.empty() || scalar_readers_.empty()) return Status::OK();

  int64_t current_row = scalar_readers_[0]->LastProcessedRow();
  for (int i = 1; i < scalar_readers_.size(); ++i) {
    if (current_row != scalar_readers_[i]->LastProcessedRow()) {
      DCHECK(false);
      return Status(Substitute(
          "Top level rows aren't in sync during page filtering in file $0.", filename()));
    }
  }
  return Status::OK();
}
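To make the failing condition easier to reason about outside of Impala, the check above can be reduced to a standalone sketch. This is only an illustration: `FakeReader` and `FirstOutOfSyncReader` are hypothetical names that do not exist in Impala, and the sketch models nothing but the comparison of each reader's last processed row against the first reader's.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a scalar column reader. Only the accessor
// used by the in-sync check is modelled here.
struct FakeReader {
  int64_t last_processed_row;
  int64_t LastProcessedRow() const { return last_processed_row; }
};

// Returns the index of the first reader whose last processed row differs
// from that of reader 0, or -1 if all readers agree (or there are none).
int FirstOutOfSyncReader(const std::vector<FakeReader>& readers) {
  if (readers.empty()) return -1;
  const int64_t expected = readers[0].LastProcessedRow();
  for (std::size_t i = 1; i < readers.size(); ++i) {
    if (readers[i].LastProcessedRow() != expected) return static_cast<int>(i);
  }
  return -1;
}
```

With readers at rows 99, 99 and 120, the function reports index 2, which corresponds to the situation in which the scanner returns the "Top level rows aren't in sync" status.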
Attachments
Issue Links
- relates to IMPALA-10186 Write invalid parquet PageLocations which table sort by some columns (Resolved)
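Since the first error in the log is "Invalid offset index" and the linked issue concerns invalid PageLocations, a reader-side sanity check may help when inspecting a suspect file. The sketch below is not Impala's validation logic. The `PageLocation` struct is a simplified mirror of the three fields in the Parquet format's `PageLocation`, `LooksValidOffsetIndex` is a hypothetical helper, and the invariants it checks (non-negative offsets, positive page sizes, in-range and strictly increasing `first_row_index`) are assumptions about what a well-formed offset index should satisfy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified mirror of the Parquet PageLocation fields (illustrative only).
struct PageLocation {
  int64_t offset;                // file offset of the page
  int32_t compressed_page_size;  // size of the page in bytes
  int64_t first_row_index;       // index of the first row in the page
};

// Hypothetical helper. Assumed invariants, not taken from Impala's source:
//  - offsets are non-negative and page sizes are positive,
//  - first_row_index lies in [0, num_rows),
//  - first_row_index strictly increases from page to page.
bool LooksValidOffsetIndex(const std::vector<PageLocation>& pages,
                           int64_t num_rows) {
  assert(num_rows >= 0);
  for (std::size_t i = 0; i < pages.size(); ++i) {
    const PageLocation& p = pages[i];
    if (p.offset < 0 || p.compressed_page_size <= 0) return false;
    if (p.first_row_index < 0 || p.first_row_index >= num_rows) return false;
    if (i + 1 < pages.size() &&
        p.first_row_index >= pages[i + 1].first_row_index) {
      return false;
    }
  }
  return true;
}
```

Under these assumptions, two pages that both claim `first_row_index` 0 would be rejected, as would a page whose `first_row_index` equals or exceeds the row group's row count.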