Spark / SPARK-26214

Add "broadcast" method to DataFrame

Details

    • Type: Wish
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Won't Fix
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      As discussed at https://stackoverflow.com/questions/43984068/does-spark-sql-autobroadcastjointhreshold-work-for-joins-using-datasets-join-op/43994022, it's possible to force the broadcast of a DataFrame in a join, even if its total size is greater than ``spark.sql.autoBroadcastJoinThreshold``.
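
For readers who have not followed the link, here is a minimal PySpark sketch of the route that exists today. The helper name `join_forcing_broadcast` and its parameters are hypothetical; `pyspark.sql.functions.broadcast`, `DataFrame.join` and `DataFrame.hint` are existing PySpark APIs.

```python
def join_forcing_broadcast(large_df, small_df, on, how="inner"):
    """Join large_df with small_df, hinting Spark to broadcast small_df.

    Hypothetical helper: only `broadcast` and `DataFrame.join` come from PySpark.
    """
    # Imported here so the sketch can be loaded without a Spark installation.
    from pyspark.sql.functions import broadcast

    # broadcast() only marks small_df with a hint. The planner then picks a
    # broadcast join where the join type allows it, without comparing the size
    # of small_df to spark.sql.autoBroadcastJoinThreshold.
    # small_df.hint("broadcast") is the equivalent spelling on the DataFrame.
    return large_df.join(broadcast(small_df), on=on, how=how)
```

With two DataFrames `orders` and `countries` (names assumed for illustration), the call would be `join_forcing_broadcast(orders, countries, on="country_code")`.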

      But this is not trivial for a beginner, because DataFrame has no "broadcast" method (I know, I am lazy ...).

      We could add this method and have it log a WARN if the DataFrame's size is greater than the threshold.
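
A rough sketch of what the proposed behaviour might look like, written as a standalone PySpark helper rather than a real DataFrame method. Everything here is an assumption for illustration: the names `exceeds_threshold` and `broadcast_with_warning` are hypothetical, and the size is a caller-supplied estimate because PySpark exposes no public call for a DataFrame's size in bytes. Only `pyspark.sql.functions.broadcast` is an existing API.

```python
import logging

logger = logging.getLogger(__name__)


def exceeds_threshold(size_bytes, threshold_bytes):
    """Return True when a frame of size_bytes is above the given threshold.

    A threshold of -1 (auto-broadcast disabled in Spark) makes every
    non-negative size count as exceeding it.
    """
    return size_bytes > threshold_bytes


def broadcast_with_warning(df, size_bytes, threshold_bytes):
    """Mark df for broadcast, logging a warning if it is above the threshold.

    size_bytes is an estimate supplied by the caller (assumption), and
    threshold_bytes is the value of spark.sql.autoBroadcastJoinThreshold.
    """
    # Imported here so the sketch can be loaded without a Spark installation.
    from pyspark.sql.functions import broadcast

    if exceeds_threshold(size_bytes, threshold_bytes):
        logger.warning(
            "Forcing broadcast of about %d bytes, above "
            "spark.sql.autoBroadcastJoinThreshold (%d)",
            size_bytes,
            threshold_bytes,
        )
    return broadcast(df)
```

The warning does not block anything: the frame is still marked for broadcast, which matches the idea of a method that forces the broadcast but tells the user when the frame is larger than what Spark would broadcast on its own.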

      (if it's an easy one, I could do it?)

          People

            Assignee: Unassigned
            Reporter: Thomas Decaux (ebuildy)
            Votes: 1
            Watchers: 2
