Hive / HIVE-9228

Problem with subquery using windowing functions


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.1, 0.14.0, 1.0.0
    • Fix Version/s: 1.0.2
    • Component/s: PTF-Windowing
    • Labels: None

    Description

      The following query, which selects from a subquery that uses window functions, fails. The inner query on its own works fine.

      select col1, col2, col3
      from (
        select col1, col2, col3,
               count(case when col4 = 1 then 1 end) over (partition by col1, col2) as col5,
               row_number() over (partition by col1, col2 order by col4) as col6
        from tab1
      ) t;

      Hive generates an execution plan with two jobs:
      1. The first job computes the window function for col5.
      2. The second job computes the window function for col6 and produces the output.

      The plan says the first job writes only the columns (col1, col2, col3, col4) to a temp file, since only these columns are used in later stages. However, the PTF operator in the first job actually outputs (_wcol0, col1, col2, col3, col4), where _wcol0 is the result of the window function, even though it is not used.

      In the second job, the map operator still reads four columns (col1, col2, col3, col4) from the temp file, as the plan specifies, while each row actually contains five. This mismatch causes the exception.
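
      The mismatch described above can be illustrated with a small toy model. This is a sketch in Python, not Hive code: the helper names (write_row, read_row) and the sample values are hypothetical, and the explicit width check merely stands in for whatever exception Hive actually raises. The last two lines show one way the two sides could agree (dropping _wcol0 before the write); the description does not say how the attached patches resolve it.

```python
# Toy model of the writer/reader schema mismatch; not Hive code.
# Column lists are taken from the description; everything else is hypothetical.
PLANNED_COLUMNS = ["col1", "col2", "col3", "col4"]                # schema recorded in the plan
PTF_OUTPUT_COLUMNS = ["_wcol0", "col1", "col2", "col3", "col4"]   # what the PTF operator emits


def write_row(values_by_name, columns):
    """Serialize a row positionally, in the order given by `columns`."""
    return [values_by_name[name] for name in columns]


def read_row(row, columns):
    """Deserialize a row positionally; fail if its width differs from the schema."""
    if len(row) != len(columns):
        raise ValueError(
            f"schema expects {len(columns)} columns but row has {len(row)}"
        )
    return dict(zip(columns, row))


# Hypothetical sample row, including the unused window-function result.
source = {"_wcol0": 2, "col1": "a", "col2": "b", "col3": "c", "col4": 1}

# Job 1 writes five fields per row to the temp file.
stored = write_row(source, PTF_OUTPUT_COLUMNS)

# Job 2 reads the temp file using the four-column schema from the plan.
try:
    read_row(stored, PLANNED_COLUMNS)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# If _wcol0 is dropped before the write, writer and reader agree again.
pruned = write_row(source, PLANNED_COLUMNS)
recovered = read_row(pruned, PLANNED_COLUMNS)
```

      In this model the first read fails because the stored row is one field wider than the planned schema, while the pruned row round-trips cleanly.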

      Attachments

        1. tab1.csv
          0.0 kB
          Aihua Xu
        2. HIVE-9228.3.patch.txt
          22 kB
          Navis Ryu
        3. HIVE-9228.2.patch.txt
          18 kB
          Navis Ryu
        4. HIVE-9228.1.patch.txt
          7 kB
          Navis Ryu
        5. create_table_tab1.sql
          0.2 kB
          Aihua Xu

          People

            Assignee: Navis Ryu (navis)
            Reporter: Aihua Xu (aihuaxu)
            Votes: 0
            Watchers: 10

              Time Tracking

              Estimated: 96h
              Remaining: 96h
              Logged: Not Specified
