Hive / HIVE-25129

Wrong results when timestamps stored in Avro/Parquet fall into the DST shift

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.2
    • None
    • None

    Description

      Timestamp values that fall into the daylight saving time (DST) gap of the system time zone are not returned as stored when they are kept in Parquet/Avro tables. A SELECT query returns those timestamps shifted by +1 hour, reflecting the DST shift.
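
      The shift can be reproduced outside Hive with the plain java.time API. The sketch below is only an illustration of how a zone-less timestamp changes once it is interpreted in a zone with a DST gap; it is not taken from Hive's Parquet/Avro code, and the class and method names (DstGapDemo, throughZone) are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

final class DstGapDemo {

    // Hypothetical helper: interprets a zone-less timestamp in the given zone
    // and reads the wall-clock value back in that same zone.
    static LocalDateTime throughZone(LocalDateTime value, ZoneId zone) {
        // atZone() resolves a local time inside a DST gap by moving it later
        // by the length of the gap, so the original wall-clock value is lost.
        return value.atZone(zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        ZoneId pacific = ZoneId.of("US/Pacific");

        // 02:00 on 2019-03-10 does not exist in US/Pacific (clocks jump to 03:00).
        LocalDateTime inGap = LocalDateTime.of(2019, 3, 10, 2, 0);
        // 01:00 on the same day is an ordinary local time.
        LocalDateTime outsideGap = LocalDateTime.of(2019, 3, 10, 1, 0);

        System.out.println(inGap + " -> " + throughZone(inGap, pacific));
        System.out.println(outsideGap + " -> " + throughZone(outsideGap, pacific));
    }
}
```

      In this sketch the 02:00 value comes back as 03:00, while the 01:00 value is unchanged, which matches the +1 hour shift described in this report.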

      Example

      --! qt:timezone:US/Pacific
      
      create table employee (eid int, birthdate timestamp) stored as parquet;
      
      insert into employee values (0, '2019-03-10 02:00:00');
      insert into employee values (1, '2020-03-08 02:00:00');
      insert into employee values (2, '2021-03-14 02:00:00');
      
      select eid, birthdate from employee order by eid;

      Actual results

      0 2019-03-10 03:00:00
      1 2020-03-08 03:00:00
      2 2021-03-14 03:00:00

      Expected results

      0 2019-03-10 02:00:00
      1 2020-03-08 02:00:00
      2 2021-03-14 02:00:00

      Storing and retrieving values in columns of the timestamp data type (equivalent to the Java LocalDateTime API) should not alter in any way the value that the user sees. The results are correct for TEXTFILE and ORC tables.
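
      For contrast, here is a minimal sketch of a round trip that keeps the wall-clock value intact by never consulting the system time zone. It assumes the timestamp is encoded as milliseconds relative to a fixed UTC offset; that is an assumption made for illustration, not a description of how Hive, ORC, or any file format actually stores timestamps, and LocalTimestampCodec is a hypothetical name.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class LocalTimestampCodec {

    // Encodes the wall-clock value as epoch milliseconds using a fixed UTC
    // offset, so no DST rules are applied. Sub-millisecond precision is dropped.
    static long encode(LocalDateTime value) {
        return value.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Decodes with the same fixed offset, which restores the original
    // wall-clock value regardless of the JVM's default time zone.
    static LocalDateTime decode(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        LocalDateTime[] birthdates = {
            LocalDateTime.of(2019, 3, 10, 2, 0),
            LocalDateTime.of(2020, 3, 8, 2, 0),
            LocalDateTime.of(2021, 3, 14, 2, 0),
        };
        for (LocalDateTime birthdate : birthdates) {
            System.out.println(birthdate + " -> " + decode(encode(birthdate)));
        }
    }
}
```

      Because the same fixed offset is used on both the write and the read side, the three timestamps from the example come back unchanged in this sketch.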

      Attachments

        1. parquet_timestamp_dst.q
          0.3 kB
          Stamatis Zampetakis

            People

              Assignee: Stamatis Zampetakis (zabetak)
              Reporter: Stamatis Zampetakis (zabetak)