Spark / SPARK-30105

Add StreamingQueryListener support to PySpark

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • 3.1.0

    Description

      Add support for StreamingQueryListener to PySpark.

      Currently, `StreamingQueryListener` is implemented in Scala as an abstract class. Py4J callback proxies can only implement Java interfaces, so we cannot access it from Python unless we create our own custom Scala/Java wrapper.

      https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala

      This would be very useful in my own case: I am building a library that sends Python errors to Sentry.io (https://docs.sentry.io/platforms/python/pyspark/) and would like to hook into `onQueryTerminated` to send errors.
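
      To illustrate the kind of hook I have in mind, here is a minimal sketch of a Python-side listener written in the Py4J callback style (an inner `Java` class with an `implements` list). It assumes a custom JVM wrapper interface exists; the name `com.example.PythonStreamingQueryListener` is hypothetical and not part of Spark. The `report` callable is also a placeholder for whatever forwards the error, for example to Sentry. The sketch relies on `QueryTerminatedEvent.exception` being an `Option[String]` on the Scala side.

```python
class ErrorReportingListener:
    """Python callback object meant to be handed to the JVM through Py4J."""

    def __init__(self, report):
        # `report` is any callable that accepts the error message string
        # (a stand-in for an error-reporting client).
        self._report = report

    def onQueryStarted(self, event):
        # Nothing to report when a query starts.
        pass

    def onQueryProgress(self, event):
        # Nothing to report on regular progress updates.
        pass

    def onQueryTerminated(self, event):
        # `exception` is declared as Option[String] in Scala, so the
        # Py4J proxy exposes it as an object with isDefined() and get().
        exception = event.exception()
        if exception.isDefined():
            self._report(exception.get())

    class Java:
        # Hypothetical wrapper interface (an assumption, not a Spark API).
        # Py4J can only proxy Java interfaces, hence the need for a wrapper
        # around the abstract StreamingQueryListener class.
        implements = ["com.example.PythonStreamingQueryListener"]
```

      On the JVM side, the wrapper would extend `StreamingQueryListener` and delegate each callback to the Python object; that part is not shown here.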

      I can take this on if you point me in the right direction. I am new to the codebase, so I am not sure what the process for porting a Scala API to PySpark usually looks like.

            People

              Assignee: Unassigned
              Reporter: Abhijeet Prasad (aprasad)
              Votes: 1
              Watchers: 3