  1. Hive
  2. HIVE-23541

Vectorization: Unbounded following window function start producing results too early

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • 3.1.2, 4.0.0
    • None
    • None

    Description

      ReduceRecordSource signals the end of a group in the reducer input whenever the entire key changes.

      ReduceRecordSource::processVectorGroup calls reducer.setNextVectorBatchGroupStatus(/* isLastGroupBatch */ true); when the last batch of a group is being processed.

      However, for PTF window functions with unbounded following, this signal is triggered by a change in the full key (partition columns plus sort columns), not by a change of partition.

      As a result, VectorPTFOperator treats a change in the sort key as a switch of the partition key and starts producing results too early.

      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java#L399
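The effect described above can be illustrated with a toy model in Python. This is only a sketch, not Hive code: `last_value_per_group` and its `group_key` parameter are hypothetical names, and "flushing" stands in for the operator emitting results when it sees the last batch of a group. The rows match the test data in this report.

```python
from itertools import groupby

# Rows as they would reach the reducer: sorted by (partition key, order key).
rows = [("A", "2019-08-15"), ("A", "2019-10-12")]

def last_value_per_group(rows, group_key):
    """Emit (name, event_dt, last_value), flushing whenever group_key changes."""
    out = []
    for _, group in groupby(rows, key=group_key):
        buffered = list(group)
        last = buffered[-1][1]  # last event_dt seen before the flush
        out.extend((name, dt, last) for name, dt in buffered)
    return out

# Flush on every change of the entire key (partition + order columns).
too_early = last_value_per_group(rows, group_key=lambda r: (r[0], r[1]))

# Flush only when the partition column changes.
expected = last_value_per_group(rows, group_key=lambda r: r[0])

print(too_early)
print(expected)
```

In this model, flushing on the full key makes each row its own group, so the first row reports its own date as the last value instead of the last date in partition `A`.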

      create temporary table test2(id STRING,name STRING,event_dt date) stored as orc;
      
      insert into test2 values ('100','A','2019-08-15'), ('100','A','2019-10-12');
      
      
      SELECT name, event_dt, first_value(event_dt) over (PARTITION BY name ORDER BY event_dt desc ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) last_event_dt FROM test2; -- streaming FIRST_VALUE with DESCENDING
      
      SELECT name, event_dt, last_value(event_dt) over (PARTITION BY name ORDER BY event_dt asc ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) last_event_dt FROM test2; -- non-streaming LAST_VALUE with ASCENDING
      

      These two queries should return identical results. The streaming version should be significantly faster than the non-streaming one, because streaming does not need to buffer or spill rows.
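A minimal Python sketch of why the two formulations are expected to agree, again using the test rows from this report. The function names are hypothetical and the code only models the window semantics, not Hive's execution: the streaming variant keeps a single value per partition, while the buffered variant holds the whole partition before emitting anything.

```python
rows = [("A", "2019-08-15"), ("A", "2019-10-12")]

def streaming_first_value_desc(rows):
    """FIRST_VALUE over event_dt DESC, UNBOUNDED PRECEDING to the current row."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    ordered.sort(key=lambda r: r[0])  # stable sort keeps DESC order within a name
    first_seen = {}
    out = []
    for name, dt in ordered:
        first_seen.setdefault(name, dt)  # only one value retained per partition
        out.append((name, dt, first_seen[name]))
    return out

def buffered_last_value_asc(rows):
    """LAST_VALUE over event_dt ASC, UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING."""
    partitions = {}
    for name, dt in sorted(rows):  # by name, then event_dt ascending
        partitions.setdefault(name, []).append(dt)  # whole partition is buffered
    return [(name, dt, dts[-1]) for name, dts in partitions.items() for dt in dts]

streaming = streaming_first_value_desc(rows)
buffered = buffered_last_value_asc(rows)
print(sorted(streaming) == sorted(buffered))
```

The row order differs between the two variants because of the opposite sort directions, so the comparison is made on the sorted output.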

      Attachments

        1. HIVE-23541.1.patch
          19 kB
          Ramesh Kumar Thangarajan

          People

            Assignee: Ramesh Kumar Thangarajan
            Reporter: Gopal Vijayaraghavan
            Votes: 0
            Watchers: 5
