Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 2.5.0, Impala 2.2.10, Impala 2.3.4
-
None
-
ghx-label-7
Description
The flag '-use_local_tz_for_unix_timestamp_conversions' was added for IMPALA-97. Enabling it results in timestamps sometimes being converted into localtime, but unfortunately this doesn't seem to be well defined when/where this conversion will happen.
I've noticed that its use seems to break the avg() aggregate function on timestamp types (despite being an odd function on timestamps, it should still work).
Impala by default, i.e. not enabling this flag:
[localhost:21000] > select timestamp_col from functional.alltypestiny; Query: select timestamp_col from functional.alltypestiny Query submitted at: 2017-03-27 18:50:57 (Coordinator: http://mj-desktop.ca.cloudera.com:25000) Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=8242bb6012948f06:143961ed00000000 +---------------------+ | timestamp_col | +---------------------+ | 2009-01-01 00:00:00 | | 2009-01-01 00:01:00 | | 2009-02-01 00:00:00 | | 2009-02-01 00:01:00 | | 2009-03-01 00:00:00 | | 2009-03-01 00:01:00 | | 2009-04-01 00:00:00 | | 2009-04-01 00:01:00 | +---------------------+ Fetched 8 row(s) in 0.02s [localhost:21000] > select avg(timestamp_col) from functional.alltypestiny; Query: select avg(timestamp_col) from functional.alltypestiny Query submitted at: 2017-03-27 18:50:59 (Coordinator: http://mj-desktop.ca.cloudera.com:25000) Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=534f6ab59b201b5e:40e2a86d00000000 +---------------------+ | avg(timestamp_col) | +---------------------+ | 2009-02-14 23:45:30 | +---------------------+
Then enabling the flag results in the same timestamps returned when scanning, but evaluating them in avg() results in them being converted:
[localhost:21000] > select timestamp_col from functional.alltypestiny; Query: select timestamp_col from functional.alltypestiny Query submitted at: 2017-03-27 18:51:17 (Coordinator: http://mj-desktop.ca.cloudera.com:25000) Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=ac4ab8fd8caf4be9:ebb0834d00000000 +---------------------+ | timestamp_col | +---------------------+ | 2009-01-01 00:00:00 | | 2009-01-01 00:01:00 | | 2009-02-01 00:00:00 | | 2009-02-01 00:01:00 | | 2009-03-01 00:00:00 | | 2009-03-01 00:01:00 | | 2009-04-01 00:00:00 | | 2009-04-01 00:01:00 | +---------------------+ Fetched 8 row(s) in 0.30s [localhost:21000] > select avg(timestamp_col) from functional.alltypestiny; Query: select avg(timestamp_col) from functional.alltypestiny Query submitted at: 2017-03-27 18:51:25 (Coordinator: http://mj-desktop.ca.cloudera.com:25000) Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=9e4e2c16896090f7:8922c4f200000000 +---------------------+ | avg(timestamp_col) | +---------------------+ | 2009-02-15 00:00:30 | +---------------------+ Fetched 1 row(s) in 0.12s
This behavior seems inconsistent and I'm pretty sure is not intentional. There are two misleading functions on TimestampValue that will do this conversion when the flag is set: ToUnixTime() and ToSubsecondUnixTime(). avg() seems to have started using ToSubsecondUnixTime() after IMPALA-2914.
Attachments
Issue Links
- relates to
-
IMPALA-7472 Consider removing TimestampValue::FromSubsecondUnixTime
- Open