Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-30598

Detect equijoins better

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: In Progress
    • Minor
    • Resolution: Unresolved
    • 3.1.0
    • None
    • SQL
    • None

    Description

      The following 2 query produce different plans, as the second one is not recognised as equijoin.

      SELECT * FROM t1 FULL OUTER JOIN t2 ON t1.c2 = 2 AND t2.c2 = 2 AND t1.c = t2.c
      
      SortMergeJoin [c#225], [c#236], FullOuter, ((c2#226 = 2) AND (c2#237 = 2))
      :- *(2) Sort [c#225 ASC NULLS FIRST], false, 0
      :  +- Exchange hashpartitioning(c#225, 5), true, [id=#101]
      :     +- *(1) Project [_1#220 AS c#225, _2#221 AS c2#226]
      :        +- *(1) LocalTableScan [_1#220, _2#221]
      +- *(4) Sort [c#236 ASC NULLS FIRST], false, 0
         +- Exchange hashpartitioning(c#236, 5), true, [id=#106]
            +- *(3) Project [_1#231 AS c#236, _2#232 AS c2#237]
               +- *(3) LocalTableScan [_1#231, _2#232]
      
      SELECT * FROM t1 FULL OUTER JOIN t2 ON t1.c2 = 2 AND t2.c2 = 2
      
      BroadcastNestedLoopJoin BuildRight, FullOuter, ((c2#226 = 2) AND (c2#237 = 2))
      :- *(1) Project [_1#220 AS c#225, _2#221 AS c2#226]
      :  +- *(1) LocalTableScan [_1#220, _2#221]
      +- BroadcastExchange IdentityBroadcastMode, [id=#146]
         +- *(2) Project [_1#231 AS c#236, _2#232 AS c2#237]
            +- *(2) LocalTableScan [_1#231, _2#232]
      

      We could detect the implicit equalities from the join condition.

      Attachments

        Issue Links

          Activity

            People

              apachespark Apache Spark
              petertoth Peter Toth
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: