Spark / SPARK-42034

QueryExecutionListener and the Observation API (`df.observe`) do not work with the `foreach` action.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.3, 3.2.2, 3.3.1
    • Fix Version/s: 3.5.0
    • Component/s: SQL
    • Environment: Tested locally and on YARN in cluster mode,
      with Spark 3.3.1, 3.2.2, and 3.1.1
      and YARN 2.9.2 and 3.2.1.

    Description

      The Observation API, the `observe` DataFrame transformation, and custom QueryExecutionListener implementations
      do not work with the `foreach` or `foreachPartition` actions.
      This is because QueryExecutionListener callbacks are not triggered for queries whose action is `foreach` or `foreachPartition`.
      The SQL tab of the Spark UI, however, still treats such a query as a SQL query and shows its query plans and related details.

      Code to reproduce it:
      https://gist.github.com/GrigorievNick/e7cf9ec5584b417d9719e2812722e6d3
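
      For readers who cannot open the gist, the following is a minimal sketch of the kind of check described above. It is not the gist's code. The object name `ForeachListenerRepro`, the observation name `stats`, and the metric name `rows` are made up for illustration. It assumes a local Spark 3.x session and uses a fixed sleep because listener callbacks are, as far as I know, delivered asynchronously in Spark 3.x.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.functions.{count, lit}
import org.apache.spark.sql.util.QueryExecutionListener

// Hypothetical reproduction sketch; names are illustrative only.
object ForeachListenerRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("SPARK-42034-sketch")
      .getOrCreate()

    // Thread-safe record of every listener callback we receive.
    val seen = new ConcurrentLinkedQueue[String]()

    spark.listenerManager.register(new QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
        seen.add(s"$funcName succeeded, observedMetrics=${qe.observedMetrics}")

      override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit =
        seen.add(s"$funcName failed: ${exception.getMessage}")
    })

    // Attach a named observation that counts the rows flowing through the plan.
    val df = spark.range(100).toDF("id")
      .observe("stats", count(lit(1)).as("rows"))

    // Same query, two different actions.
    df.collect()
    df.foreach((_: Row) => ()) // explicit parameter type avoids overload ambiguity

    // Crude wait for asynchronous listener delivery (an assumption of this sketch).
    Thread.sleep(2000)

    seen.toArray.foreach(println)
    spark.stop()
  }
}
```

      If the behavior matches this report, the affected versions print a callback for `collect` but none for `foreach`, while a version containing the fix should print both.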

          People

            Assignee: Zing zzzzming95
            Reporter: hryhoriev.nick Nick Hryhoriev
            Votes: 0
            Watchers: 4