Apache Drill / DRILL-6969

Inconsistent results when reading MaprDB JSON tables using hive plugin when native reader is enabled


Details

    Description

      Steps to reproduce:
      0. Set PST timezone.
      1. Create the table in MaprDB shell:

      create /tmp/testtimestamp
      insert /tmp/testtimestamp --value '{"_id":"1","datestring":"2018-01-01 12:12:12.123","datetimestamp":{"$date":"2018-01-01T20:12:12.123Z"}}'
      insert /tmp/testtimestamp --value '{"_id":"2","datestring":"9999-12-31 23:59:59.999","datetimestamp":{"$date":"10000-01-01T07:59:59.999Z"}}'
      

      2. Create a hive table:

      create external table `testtimestamp` (`_id` string, datestring string, datetimestamp timestamp)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.maprdb.json.serde.MapRDBSerDe'
      STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
      TBLPROPERTIES ( 'maprdb.column.id'='_id', 'maprdb.table.name'='/tmp/testtimestamp');
      

      3. Disable the native reader and query the table from Drill using the Hive plugin:

      alter session set store.hive.maprdb_json.optimize_scan_with_native_reader=false;
      select * from hive.testtimestamp;
      

      It returns:

      +------+--------------------------+--------------------------+
      | _id  |        datestring        |      datetimestamp       |
      +------+--------------------------+--------------------------+
      | 1    | 2018-01-01 12:12:12.123  | 2018-01-01 12:12:12.123  |
      | 2    | 9999-12-31 23:59:59.999  | 9999-12-31 23:59:59.999  |
      +------+--------------------------+--------------------------+
      

      4. Enable the native reader and run the same query on the same table:

      alter session set store.hive.maprdb_json.optimize_scan_with_native_reader=true;
      select * from hive.testtimestamp;
      

      It returns:

      +------+--------------------------+---------------------------+
      | _id  |        datestring        |       datetimestamp       |
      +------+--------------------------+---------------------------+
      | 1    | 2018-01-01 12:12:12.123  | 2018-01-01 20:12:12.123   |
      | 2    | 9999-12-31 23:59:59.999  | 10000-01-01 07:59:59.999  |
      +------+--------------------------+---------------------------+
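
      The two result sets differ by exactly the PST offset of 8 hours (UTC-8; no daylight saving applies on these dates): with the native reader the stored UTC value is returned unchanged, while without it the value is shifted to local time. As a rough check that is not part of the original report, subtracting 8 hours from the native reader's value for row 1 should give the value from step 3. The query below is a sketch that assumes Drill's DATE_SUB function accepts a timestamp literal and an interval:

      select date_sub(timestamp '2018-01-01 20:12:12.123', interval '8' hour) from (values(1));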
      

      For documentation:

      Added the following MapR-DB format setting:

      Option: readTimestampWithZoneOffset
      Description: When enabled, Drill converts timestamp values read from MapR Database from UTC to the local timezone. Disabled by default.
      Value: true|false

      Added the following configuration option:

      Name: store.hive.maprdb_json.read_timestamp_with_timezone_offset
      Default: FALSE
      Description: Enables Drill to read timestamp values with the timezone offset applied when the Hive plugin is used and the Drill native MapR-DB JSON reader is enabled. (Drill 1.16+)
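
      A minimal sketch of how the new option might be applied to the reproduction above. It assumes the option can be set at session level in the same way as store.hive.maprdb_json.optimize_scan_with_native_reader; the report itself does not show this step:

      alter session set store.hive.maprdb_json.optimize_scan_with_native_reader=true;
      alter session set store.hive.maprdb_json.read_timestamp_with_timezone_offset=true;
      select * from hive.testtimestamp;

      With both options enabled, the datetimestamp column is expected to match the output of step 3 (local time) rather than the UTC values from step 4.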

            People

              volodymyr Vova Vysotskyi
              volodymyr Vova Vysotskyi
              Aman Sinha Aman Sinha