Spark / SPARK-38965

Optimize RemoteBlockPushResolver with a memory pool

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.3.0
    • Fix Version/s: None
    • Component/s: Shuffle
    • Labels: None

    Description

      In the push-based shuffle service, many BLOCK_APPEND_COLLISION_DETECTED errors occur when there are many small map task outputs. In RemoteBlockPushResolver, while a block pushed by one map task is being written, blocks pushed by other map tasks for the same shuffle partition fail in the onComplete() method.
      RemoteBlockPushResolver also has no memory limit, so many executors will OOM when there are many small pushed blocks waiting to be written to the final data file.
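
      The memory-limit idea in the title could be sketched roughly as below. This is a minimal illustration only, not the patch proposed for this issue: the class name PushedBlockMemoryPool and its methods are hypothetical and do not exist in Spark. The assumption is that a pushed block which cannot be written immediately reserves its size from a fixed byte budget before being buffered, and that the reservation is released once the block has been written or rejected.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical byte-budget pool for buffered pushed blocks.
 * Not part of Spark; names and behavior are assumptions for illustration.
 */
final class PushedBlockMemoryPool {
  private final long capacityBytes;
  private final AtomicLong used = new AtomicLong(0);

  PushedBlockMemoryPool(long capacityBytes) {
    if (capacityBytes <= 0) {
      throw new IllegalArgumentException("capacityBytes must be positive");
    }
    this.capacityBytes = capacityBytes;
  }

  /**
   * Tries to reserve the given number of bytes without blocking.
   * Returns false when the reservation would exceed the budget, so the
   * caller can reject or defer the block instead of buffering it.
   */
  boolean tryReserve(long bytes) {
    if (bytes < 0) {
      throw new IllegalArgumentException("bytes must be non-negative");
    }
    while (true) {
      long current = used.get();
      // Compare against the remaining budget to avoid long overflow.
      if (bytes > capacityBytes - current) {
        return false;
      }
      if (used.compareAndSet(current, current + bytes)) {
        return true;
      }
      // Another thread changed the counter; retry with the new value.
    }
  }

  /** Returns previously reserved bytes to the pool. */
  void release(long bytes) {
    if (bytes < 0) {
      throw new IllegalArgumentException("bytes must be non-negative");
    }
    if (used.addAndGet(-bytes) < 0) {
      throw new IllegalStateException("released more bytes than were reserved");
    }
  }

  /** Bytes currently reserved. */
  long usedBytes() {
    return used.get();
  }
}
```

      Under these assumptions, a caller would invoke tryReserve(blockSize) before buffering a block and release(blockSize) after the block is appended to the data file or dropped. Returning false instead of blocking keeps the budget check off the critical path of network threads; how a rejected block should be reported back to the pusher is left open here.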

          People

            Assignee: Unassigned
            Reporter: Wan Kun (wankun)
            Votes: 0
            Watchers: 2
