Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 4.2.0
Description
The current Parquet writer writes -1 to PageLocation.offset and PageLocation.first_row_index when it encounters an empty page.
hdfs-parquet-file-writer.cc, lines 808-819:
// Write data pages
for (const DataPage& page : pages_) {
  if (page.header.data_page_header.num_values == 0) {
    // Skip empty pages
    location.offset = -1;
    location.compressed_page_size = 0;
    location.first_row_index = -1;
    AddLocationToOffsetIndex(location);
    continue;
  }
However, these -1 values can cause the ComputeCandidatePages function to run into an unexpected state:
bool ComputeCandidatePages(
    const vector<parquet::PageLocation>& page_locations,
    const vector<RowRange>& candidate_ranges,
    const int64_t num_rows, vector<int>* candidate_pages) {
  if (!ValidatePageLocations(page_locations, num_rows)) return false;
This in turn causes IMPALA-9952.
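To make the failure mode concrete, here is a minimal standalone sketch rather than Impala's actual code. PageLocation is a simplified stand-in for parquet::PageLocation. PageInfo, BuildOffsetIndex and LooksValid are hypothetical names invented for this illustration. The checks in LooksValid (non-negative offsets, strictly increasing first_row_index below num_rows) are an assumption about what a reader-side validation might require, not a copy of ValidatePageLocations.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for parquet::PageLocation (same field names).
struct PageLocation {
  int64_t offset = 0;
  int32_t compressed_page_size = 0;
  int64_t first_row_index = 0;
};

// Hypothetical page summary holding only what this sketch needs.
struct PageInfo {
  int32_t num_values;
  int32_t compressed_size;
};

// Mirrors the writer excerpt: an empty page gets offset = -1,
// compressed_page_size = 0 and first_row_index = -1.
std::vector<PageLocation> BuildOffsetIndex(
    const std::vector<PageInfo>& pages, int64_t file_pos) {
  std::vector<PageLocation> index;
  int64_t next_row = 0;
  for (const PageInfo& page : pages) {
    assert(page.num_values >= 0);
    PageLocation loc;
    if (page.num_values == 0) {
      loc.offset = -1;
      loc.compressed_page_size = 0;
      loc.first_row_index = -1;
      index.push_back(loc);
      continue;
    }
    loc.offset = file_pos;
    loc.compressed_page_size = page.compressed_size;
    loc.first_row_index = next_row;
    index.push_back(loc);
    file_pos += page.compressed_size;
    next_row += page.num_values;  // assumes a flat column: one value per row
  }
  return index;
}

// Hypothetical validation (an assumption, not Impala's ValidatePageLocations):
// offsets must be non-negative, and first_row_index must be strictly
// increasing and smaller than num_rows.
bool LooksValid(const std::vector<PageLocation>& locs, int64_t num_rows) {
  int64_t prev_first_row = -1;
  for (const PageLocation& loc : locs) {
    if (loc.offset < 0) return false;
    if (loc.first_row_index <= prev_first_row) return false;
    if (loc.first_row_index >= num_rows) return false;
    prev_first_row = loc.first_row_index;
  }
  return true;
}
```

Under these assumptions, an index built from the pages {100 values, 0 values, 50 values} contains a (-1, 0, -1) entry in the middle and is rejected as a whole, while the same pages without the empty one pass the check.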
Issue Links
- is related to: IMPALA-9952 Invalid offset index in Parquet file (Resolved)