Spark / SPARK-28605

Performance regression in SS's foreach


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.4.0, 2.4.1, 2.4.2, 2.4.3
    • Fix Version/s: None
    • Component/s: Structured Streaming

    Description

      When "ForeachWriter.open" returns "false", ForeachSink v1 skips the whole partition without reading any data. In ForeachSink v2, due to an API limitation, the whole partition still has to be read even though every row is simply dropped.
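
      The difference can be illustrated with a small plain-Python simulation. This is only a sketch of the two control flows described above, not Spark's actual sink code: `CountingPartition`, `SkippingWriter`, `run_v1_style` and `run_v2_style` are hypothetical names invented for this example, and only the `open`/`process`/`close` method shape mirrors `ForeachWriter`.

```python
class CountingPartition:
    """Hypothetical stand-in for one partition; counts rows actually read."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.rows_read = 0

    def __iter__(self):
        for row in self.rows:
            self.rows_read += 1  # a row is "read" when it is pulled from the source
            yield row


class SkippingWriter:
    """Writer shaped like ForeachWriter whose open() always declines the partition."""

    def open(self, partition_id, epoch_id):
        return False  # e.g. this (partition, epoch) was already committed

    def process(self, row):
        raise AssertionError("process() must not be called when open() returned False")

    def close(self, error):
        pass


def run_v1_style(writer, partition, partition_id, epoch_id):
    """Assumed v1-like flow: the sink owns the iterator and may never touch it."""
    if writer.open(partition_id, epoch_id):
        for row in partition:
            writer.process(row)
    writer.close(None)


def run_v2_style(writer, partition, partition_id, epoch_id):
    """Assumed v2-like flow: rows are pushed one by one, so the writer can
    only drop them after they have already been read."""
    opened = writer.open(partition_id, epoch_id)
    for row in partition:
        if opened:
            writer.process(row)
        # otherwise the row is read and then discarded
    writer.close(None)


v1_partition = CountingPartition(range(5))
run_v1_style(SkippingWriter(), v1_partition, partition_id=0, epoch_id=0)

v2_partition = CountingPartition(range(5))
run_v2_style(SkippingWriter(), v2_partition, partition_id=0, epoch_id=0)

print(v1_partition.rows_read)  # 0: the partition was skipped without reading
print(v2_partition.rows_read)  # 5: every row was read and then dropped
```

      In this sketch the cost of the v2-style flow grows with partition size even though no row is ever processed, which is the regression the description reports.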

            People

              Assignee: Unassigned
              Reporter: Shixiong Zhu (zsxwing)
              Votes: 0
              Watchers: 3
