Hive / HIVE-28103

The StandardStructObjectInspector for JSON SerDe is forcefully converting field names to lowercase.


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0-beta-1
    • None
    • None

    Description

      When using the org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector class, field names appear to be automatically converted to lowercase by the following code:
       
      ```
      this.fieldName = fieldName.toLowerCase();
      ```
       
      This causes problems when querying JSON-formatted tables, particularly when nested struct field names in the JSON data mix uppercase and lowercase characters. Because StandardStructObjectInspector lowercases the field names, the names in the table schema no longer match the field names in the data, which leads to errors when reading the data.
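
      The mismatch can be illustrated with a minimal, self-contained sketch. It does not use any Hive classes: a plain `Map` stands in for one parsed JSON object, and the class `FieldNameCaseDemo`, its methods, and the sample value are hypothetical names introduced only for this illustration. It assumes the reader looks up JSON keys case-sensitively, which is what the symptom described above suggests.

      ```java
      import java.util.LinkedHashMap;
      import java.util.Locale;
      import java.util.Map;

      // Illustration only: plain Java collections stand in for a parsed JSON
      // object and a struct field name. No Hive classes are involved.
      final class FieldNameCaseDemo {

          // Same normalization as the quoted line: the declared name is lowercased.
          // Locale.ROOT is used here only to keep the result locale-independent.
          static String normalizeFieldName(String declaredName) {
              return declaredName.toLowerCase(Locale.ROOT);
          }

          // Case-sensitive lookup: returns null when only the case differs.
          static String lookupExact(Map<String, String> jsonObject, String fieldName) {
              return jsonObject.get(fieldName);
          }

          // Case-insensitive lookup: one hypothetical way to tolerate the mismatch.
          static String lookupIgnoreCase(Map<String, String> jsonObject, String fieldName) {
              for (Map.Entry<String, String> entry : jsonObject.entrySet()) {
                  if (entry.getKey().equalsIgnoreCase(fieldName)) {
                      return entry.getValue();
                  }
              }
              return null;
          }

          public static void main(String[] args) {
              // Stand-in for the JSON object {"MD5": "d41d8cd9"} (sample value).
              Map<String, String> condKeys = new LinkedHashMap<>();
              condKeys.put("MD5", "d41d8cd9");

              // The schema declares MD5, but the inspector-style name is md5.
              String fieldName = normalizeFieldName("MD5");

              System.out.println(fieldName);                              // md5
              System.out.println(lookupExact(condKeys, fieldName));       // null
              System.out.println(lookupIgnoreCase(condKeys, fieldName));  // d41d8cd9
          }
      }
      ```

      In this sketch the exact lookup misses the `MD5` key because only the schema side was lowercased, while the case-insensitive lookup still finds it. Whether a fix should preserve the declared case or match keys case-insensitively is a design question for the issue, not something this sketch decides.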
       
      SQL to reproduce this behavior:
      ```
      -- Create a JSON table; `struct<MD5:string>` becomes lowercase: `struct<md5:string>`.
      CREATE TABLE `test.hive_json_struct_schema`(
        `cond_keys` struct<MD5:string>
      )
      ROW FORMAT SERDE
        'org.apache.hive.hcatalog.data.JsonSerDe'
      STORED AS INPUTFORMAT
        'org.apache.hadoop.mapred.TextInputFormat'
      OUTPUTFORMAT
        'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
      ```

          People

            Assignee: Unassigned
            Reporter: chang.wd 常伟东