Hive / HIVE-10585

Range based Windowing is handled incorrectly for String types

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: PTF-Windowing
    • Labels: None

    Description

      Thanks to yhuai for pointing this out.
      I think the intent for ordinal datatypes (like string) was to measure distance as the number of changed values. So '2 preceding' would mean: go back until you have reached the 2nd distinct value that differs from the value in the 'current' row.

      But this is not the way it is implemented. StringValueBoundaryScanner simply ignores the preceding amount.
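
The intended reading described above (distance counted in distinct order-key values rather than rows) can be illustrated with a small sketch. This is only a model of the semantics under stated assumptions, not Hive's implementation: the function name `range_window_sums`, its parameters, and the sample data are hypothetical, the input is assumed to be the rows of a single partition, and Python's string ordering stands in for Hive's.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def range_window_sums(
    rows: Sequence[Tuple[str, int]], preceding: int, following: int
) -> List[int]:
    """Sum values over a RANGE window measured in distinct key values.

    rows: (order_key, value) pairs belonging to ONE partition (hypothetical input).
    preceding / following: how many distinct keys to reach back / forward.
    """
    # Sorted distinct keys; a key's index here plays the role of a dense rank.
    distinct = sorted({key for key, _ in rows})

    # Total value per distinct key, plus the rank of every input row.
    totals = [0] * len(distinct)
    ranks = []
    for key, value in rows:
        rank = bisect_left(distinct, key)
        ranks.append(rank)
        totals[rank] += value

    # Prefix sums over the per-key totals so each window is an O(1) lookup.
    prefix = [0]
    for total in totals:
        prefix.append(prefix[-1] + total)

    sums = []
    for rank in ranks:
        lo = max(0, rank - preceding)                   # N distinct values back
        hi = min(len(distinct) - 1, rank + following)   # N distinct values ahead
        sums.append(prefix[hi + 1] - prefix[lo])
    return sums


# Two rows share the key "a", so they are peers and get the same window.
sample = [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("d", 5)]
print(range_window_sums(sample, 1, 1))  # -> [6, 6, 10, 12, 9]
```

Under this reading, rows with equal keys always share a frame, and the frame grows by whole groups of peers as the preceding/following amounts increase.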

      Here is an example from windowing.q that is not handled correctly:

      -- 31. testWindowCrossReference
      select p_mfgr, p_name, p_size, 
      sum(p_size) over w1 as s1, 
      sum(p_size) over w2 as s2
      from part 
      window w1 as (partition by p_mfgr order by p_name range between 2 preceding and 2 following), 
             w2 as w1;
      

          People

            Assignee: Chaoyu Tang (ctang)
            Reporter: Harish Butani (rhbutani)
            Votes: 0
            Watchers: 2
