Spark / SPARK-28605

Performance regression in SS's foreach

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.4.0, 2.4.1, 2.4.2, 2.4.3
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None

      Description

      When "ForeachWriter.open" returns "false", ForeachSink v1 skips the whole partition without reading any data. In ForeachSink v2, however, an API limitation forces the sink to read the whole partition even if all of the data is then dropped.
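
      The difference described above can be sketched with a small, self-contained simulation. This is not Spark's actual implementation: `CountingPartition`, `SkippingWriter`, `run_v1_style`, and `run_v2_style` are hypothetical names invented for illustration, and the model only assumes that the v1 sink can consult open() before touching the partition iterator, while the v2 sink is handed rows one at a time and so cannot avoid reading them.

```python
from typing import Iterable, Iterator, List, Optional


class CountingPartition:
    """Hypothetical stand-in for a partition whose rows are costly to read."""

    def __init__(self, rows: Iterable[str]) -> None:
        self._rows: List[str] = list(rows)
        self.rows_read = 0

    def __iter__(self) -> Iterator[str]:
        for row in self._rows:
            self.rows_read += 1  # count every row actually pulled from the source
            yield row


class SkippingWriter:
    """Hypothetical writer whose open() always declines the partition."""

    def __init__(self) -> None:
        self.processed: List[str] = []

    def open(self, partition_id: int, epoch_id: int) -> bool:
        return False

    def process(self, row: str) -> None:
        self.processed.append(row)

    def close(self, error: Optional[Exception]) -> None:
        pass


def run_v1_style(writer: SkippingWriter, partition: CountingPartition) -> None:
    """Assumed v1 shape: ask open() first, iterate only if it accepted."""
    if writer.open(0, 0):
        for row in partition:
            writer.process(row)
    writer.close(None)


def run_v2_style(writer: SkippingWriter, partition: CountingPartition) -> None:
    """Assumed v2 shape: rows are pushed one by one, so each is read, then dropped."""
    accepted = writer.open(0, 0)
    for row in partition:
        if accepted:
            writer.process(row)
    writer.close(None)


v1_partition = CountingPartition(["a", "b", "c"])
run_v1_style(SkippingWriter(), v1_partition)

v2_partition = CountingPartition(["a", "b", "c"])
run_v2_style(SkippingWriter(), v2_partition)

print(v1_partition.rows_read, v2_partition.rows_read)  # prints "0 3"
```

      In this model neither writer processes any row, but the v2-style loop still pays the cost of reading all three, which is the regression the description refers to.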

              People

              • Assignee: Unassigned
              • Reporter: Shixiong Zhu (zsxwing)
              • Votes: 0
              • Watchers: 4