Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-25589

SQL: Implement HAVING/QUALIFY predicates for ROW_NUMBER()=1

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 4.0.0
    • None
    • CBO, SQL

    Description

      The insert queries which use a row_num()=1 function are inconvenient to write or port from an existing workload, because there is no easy way to ignore a column in this pattern.

      INSERT INTO main_table 
      SELECT * from duplicated_table
      QUALIFY ROW_NUMER() OVER (PARTITION BY event_id) = 1;
      

      needs to be rewritten into

      INSERT INTO main_table
      select event_id, event_ts, event_attribute, event_metric1, event_metric2, event_metric3, event_metric4, .., event_metric43 from 
      (select *, ROW_NUMBER() OVER (PARTITION BY event_id) as rnum from duplicated_table)
      where rnum=1;
      

      This is a time-consuming and error-prone rewrite (dealing with a messed up order of columns between one source and dest table).

      An alternate rewrite would be to do the same or similar syntax using HAVING.

      INSERT INTO main_table 
      SELECT * from duplicated_table
      HAVING ROW_NUMER() OVER (PARTITION BY event_id) = 1;
      

      Attachments

        Issue Links

          Activity

            People

              kkasa Krisztian Kasa
              gopalv Gopal Vijayaraghavan
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h
                  1h