Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-11742

last_value window specifier enforces ordering as a partition

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • PTF-Windowing
    • None

    Description

      HIVE-4262 changed the partitioning behavior of the last_value function. For a specified last_value() OVER X. The ordering spec within X is used in addition to the partition spec for partitioning. i.e. last_value(a) OVER (PARTITION BY i ORDER BY j) operates last_value(a) on all rows within the unique combination of (i,j). The behavior I'd expect is for PARTITION BY i to define the partitioning, and ORDER BY to define the ordering within the PARTITION. i.e. last_value(a) OVER (PARTITION BY i ORDER BY j) should operate last_value(a) on all rows within the unique values of , ordered by j within the partition.

      This was changed to be consistent with how SQLServer handled such queries. SQLServer Docs describe their example (which performs as Hive does):

      The PARTITION BY clause partitions the employees by department and the LAST_VALUE function is applied to each partition independently. The ORDER BY clause specified in the OVER clause determines the logical order in which the LAST_VALUE function is applied to the rows in each partition.

      To me, their behavior is inconsistent with their description. I've filled an upstream bug with Microsoft for the same.

      Oracle and Redshift both exhibit the behavior I'd expect.

      Considering Hive-4262 has been in core for 2+ years, I don't think we can change the behavior without potentially impacting clients. But I would like a way to enable the expected behavior at the least (behind a config flag maybe?). What do you think?

      Attachments

        Activity

          People

            prateekrungta Prateek Rungta
            prateekrungta Prateek Rungta
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: