  1. Hive
  2. HIVE-21690

Support outer joins with HiveAggregateJoinTransposeRule and turn it on by default


Details

    • Improvement
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • None
    • None
    • Query Planning
    • None

    Description

      1) This optimization is off by default, and we would like to turn it on. It pushes the group by below the join. In some cases the top aggregate is removed, but in most cases the optimization adds extra aggregate nodes. To determine whether those extra aggregates are beneficial (they might add overhead without reducing rows), the costs of the previous plan and the new plan are computed and compared.

      Since Hive's cost model considers only the JOIN cost and discards the cost of all other nodes, this comparison always favors the new plan (adding an aggregate beneath the join reduces the number of rows the join processes and therefore reduces the join cost). Turning on this optimization with the existing cost model is therefore not a good idea.
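      The bias described above can be illustrated with a toy model. This is only a sketch: the class, the node kinds, and the cost numbers are hypothetical and are not Hive's or Calcite's actual cost classes. It assumes a plan is a flat list of nodes and that the cumulative cost counts only JOIN nodes.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names come from Hive or Calcite.
public final class JoinOnlyCostSketch {
    enum Kind { JOIN, AGGREGATE }

    static final class Node {
        final Kind kind;
        final double cost; // hypothetical per-node cost, e.g. rows processed

        Node(Kind kind, double cost) {
            this.kind = kind;
            this.cost = cost;
        }
    }

    /** Cumulative plan cost that counts JOIN nodes and ignores everything else. */
    static double joinOnlyCost(List<Node> plan) {
        double total = 0;
        for (Node n : plan) {
            if (n.kind == Kind.JOIN) {
                total += n.cost;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Previous plan: join, then one aggregate on top.
        List<Node> oldPlan = Arrays.asList(
            new Node(Kind.JOIN, 1_001_000),
            new Node(Kind.AGGREGATE, 1_000_000));

        // New plan: an extra aggregate below the join that barely reduces rows.
        List<Node> newPlan = Arrays.asList(
            new Node(Kind.AGGREGATE, 1_000_000),
            new Node(Kind.JOIN, 991_000),
            new Node(Kind.AGGREGATE, 990_000));

        // The extra aggregate is invisible to the model, so the new plan wins.
        System.out.println(joinOnlyCost(newPlan) < joinOnlyCost(oldPlan)); // prints true
    }
}
```

      In this made-up example the new plan does more aggregate work overall, yet a join-only comparison still prefers it because the join input shrank slightly.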

      One approach to fix this is to localize the cost computation to the rule itself, i.e., compute the non-cumulative cost of the existing aggregate and join, and compare it with the combined cost of the new aggregates, the join, and the top aggregate.
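      A minimal sketch of such a rule-local check follows. It is not the patch attached to this issue; the class and method names are hypothetical, and costs are plain doubles rather than Calcite cost objects. It assumes the top aggregate's cost is passed as 0 when the rewrite removes it.

```java
// Hypothetical rule-local cost check; not taken from the HIVE-21690 patches.
public final class LocalCostCheckSketch {
    /** Non-cumulative cost of the nodes matched by the rule before the rewrite. */
    static double costBefore(double aggregate, double join) {
        return aggregate + join;
    }

    /** Non-cumulative cost after the rewrite; topAggregate is 0 if it was removed. */
    static double costAfter(double[] pushedAggregates, double join, double topAggregate) {
        double total = join + topAggregate;
        for (double c : pushedAggregates) {
            total += c;
        }
        return total;
    }

    /** Apply the transpose only when the rewritten nodes are strictly cheaper. */
    static boolean shouldTranspose(double oldAggregate, double oldJoin,
                                   double[] pushedAggregates, double newJoin,
                                   double newTopAggregate) {
        return costAfter(pushedAggregates, newJoin, newTopAggregate)
            < costBefore(oldAggregate, oldJoin);
    }

    public static void main(String[] args) {
        // Pushed aggregate barely reduces rows: the rewrite is rejected.
        System.out.println(shouldTranspose(
            1_000_000, 1_001_000, new double[] {1_000_000}, 991_000, 990_000));

        // Pushed aggregate collapses the input and the top aggregate is removed:
        // the rewrite is accepted.
        System.out.println(shouldTranspose(
            1_000_000, 1_001_000, new double[] {1_000_000}, 2_000, 0));
    }
}
```

      Because only the nodes touched by the rule are compared, the decision does not depend on how the rest of the plan is costed.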

      A better approach, in my opinion, would be to fix the cost model so that it takes the aggregate cost into account (along with the join cost). This could affect other queries and cause performance regressions, but those would most likely be planning issues that should be investigated and fixed.

      2) This optimization currently supports only INNER JOIN. It can be extended to support OUTER joins.

       

      cc [~jcamachorodriguez] Ashutosh Chauhan Gopal Vijayaraghavan

      Attachments

        1. HIVE-21690.1.patch
          2 kB
          Vineet Garg
        2. HIVE-21690.2.patch
          17 kB
          Vineet Garg
        3. HIVE-21690.3.patch
          20 kB
          Vineet Garg

        Activity


          People

            vgarg Vineet Garg
            vgarg Vineet Garg

            Dates

              Created:
              Updated:
