HIVE-27199: Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom formats

Details

    Description

      Timestamp values come in many flavors and formats, and no single representation can satisfy everyone, especially when such values are stored in plain text/CSV files.

      HIVE-9298 added a special SERDE property, timestamp.formats, that allows users to provide custom timestamp patterns so that TIMESTAMP values coming from files are parsed correctly.

      However, when the column type is TIMESTAMP WITH LOCAL TIME ZONE (LTZ), it is not possible to use a custom pattern. As a result, when a value does not match the format expected by the built-in Hive parser, NULL is returned.

      Consider a text file, F1, with the following values:

      2016-05-03 12:26:34
      2016-05-03T12:26:34
      

      and a table with a column declared as LTZ.

      CREATE TABLE ts_table (ts TIMESTAMP WITH LOCAL TIME ZONE);
      LOAD DATA LOCAL INPATH './F1' INTO TABLE ts_table;
      
      SELECT * FROM ts_table;
      2016-05-03 12:26:34.0 US/Pacific
      NULL
      

      To give more flexibility to users relying on the TIMESTAMP WITH LOCAL TIME ZONE datatype, and to align its behavior with the TIMESTAMP type, this JIRA aims to reuse the timestamp.formats property for both TIMESTAMP types.
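
      A minimal sketch of how this could look for the ts_table example above, assuming the property ends up being honored for LTZ columns the same way HIVE-9298 handles TIMESTAMP. The pattern list is an illustrative choice and is not taken from the patch.

      ```sql
      -- Assumption: timestamp.formats is applied to TIMESTAMP WITH LOCAL TIME ZONE
      -- columns as proposed in this issue. Multiple patterns are comma-separated.
      ALTER TABLE ts_table SET SERDEPROPERTIES (
        "timestamp.formats"="yyyy-MM-dd HH:mm:ss,yyyy-MM-dd'T'HH:mm:ss");

      -- Re-read the rows loaded from F1 with the custom patterns in place.
      SELECT * FROM ts_table;
      ```

      With both patterns listed, the second row of F1 would be expected to parse instead of coming back as NULL.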

      The work here focuses exclusively on simple text files, but the same could be done for other SERDEs such as JSON.

          People

            zabetak Stamatis Zampetakis

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h 40m