IMPALA / IMPALA-10725 Support JSON format tables / IMPALA-12927

Support reading BINARY columns in JSON tables


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.3.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels: None

    Description

      Impala currently cannot correctly read BINARY columns from JSON files written by Hive. Every non-empty binary value is returned as NULL, and the query reports a conversion warning for each affected row:

      
      select * from functional_json.binary_tbl;
      +----+--------------+------------+
      | id | string_col   | binary_col |
      +----+--------------+------------+
      | 1  | ascii        | NULL       |
      | 2  | ascii        | NULL       |
      | 3  | null         | NULL       |
      | 4  | empty        |            |
      | 5  | valid utf8   | NULL       |
      | 6  | valid utf8   | NULL       |
      | 7  | invalid utf8 | NULL       |
      | 8  | invalid utf8 | NULL       |
      +----+--------------+------------+
      WARNINGS: Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'binary1'
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
      Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'binary2'
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
      Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: 'árvíztűrőtükörfúró'
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
      Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '你好hello'
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
      Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '��'
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
      Error converting column: functional_json.binary_tbl.binary_col, type: STRING, data: '�D3"'
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0, before offset: 481
      
      

      The single file in the table looks like this:

      
       hdfs://localhost:20500/test-warehouse/binary_tbl_json/000000_0
      
      {"id":1,"string_col":"ascii","binary_col":"binary1"}
      {"id":2,"string_col":"ascii","binary_col":"binary2"}
      {"id":3,"string_col":"null","binary_col":null}
      {"id":4,"string_col":"empty","binary_col":""}
      {"id":5,"string_col":"valid utf8","binary_col":"árvíztűrőtükörfúró"}
      {"id":6,"string_col":"valid utf8","binary_col":"你好hello"}
      {"id":7,"string_col":"invalid utf8","binary_col":"\u0000�\u0000�"}
      {"id":8,"string_col":"invalid utf8","binary_col":"�D3\"\u0011\u0000"}
      
      

       


          People

            eyizoha Zihao Ye
            csringhofer Csaba Ringhofer