Details
-
Sub-task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
Impala 4.3.0
-
None
-
None
Description
Currently Impala cannot correctly read BINARY columns in JSON files written by Hive. Non-empty values are returned as NULL and the query reports conversion warnings:
select * from functional_json.binary_tbl;
+----+--------------+------------+
| id | string_col   | binary_col |
+----+--------------+------------+
| 1  | ascii        | NULL       |
| 2  | ascii        | NULL       |
| 3  | null         | NULL       |
| 4  | empty        |            |
| 5  | valid utf8   | NULL       |
| 6  | valid utf8   | NULL       |
| 7  | invalid utf8 | NULL       |
| 8  | invalid utf8 | NULL       |
+----+--------------+------------+
WARNINGS: Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'binary1'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'binary2'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'árvíztűrőtükörfúró'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '你好hello'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '��'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '�D3"'
Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
The single file in the table looks like this:
hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0
{"id":1,"string_col":"ascii","binary_col":"binary1"}
{"id":2,"string_col":"ascii","binary_col":"binary2"}
{"id":3,"string_col":"null","binary_col":null}
{"id":4,"string_col":"empty","binary_col":""}
{"id":5,"string_col":"valid utf8","binary_col":"árvíztűrőtükörfúró"}
{"id":6,"string_col":"valid utf8","binary_col":"你好hello"}
{"id":7,"string_col":"invalid utf8","binary_col":"\u0000�\u0000�"}
{"id":8,"string_col":"invalid utf8","binary_col":"�D3\"\u0011\u0000"}
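The report does not state why the conversion fails. One hypothesis that matches the output is that the scanner expects base64-encoded text for BINARY columns, while Hive wrote the raw values as plain JSON strings. This is an assumption, not something the source confirms. The Python sketch below only checks that hypothesis against the sample records; the helper names strict_base64_ok and check_records are made up for illustration and are not part of Impala or Hive.

```python
import base64
import json

# First six records copied from the file shown above. The two records with
# invalid UTF-8 are omitted because their bytes did not survive as text.
SAMPLE_LINES = [
    '{"id":1,"string_col":"ascii","binary_col":"binary1"}',
    '{"id":2,"string_col":"ascii","binary_col":"binary2"}',
    '{"id":3,"string_col":"null","binary_col":null}',
    '{"id":4,"string_col":"empty","binary_col":""}',
    '{"id":5,"string_col":"valid utf8","binary_col":"árvíztűrőtükörfúró"}',
    '{"id":6,"string_col":"valid utf8","binary_col":"你好hello"}',
]


def strict_base64_ok(value):
    """Return True if value is None (JSON null) or decodes as strict base64."""
    if value is None:
        return True
    try:
        base64.b64decode(value, validate=True)
        return True
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


def check_records(lines):
    """Map each record id to whether its binary_col passes strict base64."""
    results = {}
    for line in lines:
        row = json.loads(line)
        results[row["id"]] = strict_base64_ok(row["binary_col"])
    return results


if __name__ == "__main__":
    for row_id, ok in check_records(SAMPLE_LINES).items():
        print(row_id, "decodes" if ok else "fails strict base64 decoding")
```

Under this assumption, only ids 3 (null) and 4 (empty string) pass, which lines up with the rows that did not produce a warning in the query output above. Rows 7 and 8 are left out of the check, so the sketch says nothing about them.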