Spark / SPARK-44951

Improve Spark Dynamic Allocation


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core, YARN
    • Labels: None

    Description

      For Spark 4 we should aim to improve Spark's dynamic allocation. Some potential ideas include the following:

      • Pluggable DEA algorithms
      • Reduce waste on the RM side: sometimes the driver asks for some units of resources, but by the time the RM provisions them, the driver has cancelled the request.
      • Support for "warm" executor pools that are not tied to a particular driver, but instead start up and wait for a driver to connect and "claim" them.
      • More explicit cost vs. app-runtime configuration: a good DEA algorithm should let the developer choose between cost and runtime. Sometimes developers may be willing to pay more for faster execution.
      • Use information from previous runs to inform future runs
      • Better selection of which executors to scale down
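      The pluggable-algorithm and cost-vs-runtime ideas above could share one small policy interface. Below is a minimal Java sketch under stated assumptions: the `AllocationPolicy` interface, the `CostRuntimePolicy` class, and every parameter name are hypothetical illustrations, not part of Spark's current `ExecutorAllocationManager`, which is not pluggable today.

```java
// Hypothetical plug-in point: given current demand, return a target executor count.
interface AllocationPolicy {
    int targetExecutors(int pendingTasks, int runningExecutors);
}

// Illustrative policy that blends between lowest cost and fastest runtime
// via a single weight, runtimeBias, in [0.0, 1.0].
class CostRuntimePolicy implements AllocationPolicy {
    private final double runtimeBias;    // 1.0 = favor runtime, 0.0 = favor cost
    private final int tasksPerExecutor;  // task slots per executor
    private final int maxExecutors;      // hard upper bound from the RM

    CostRuntimePolicy(double runtimeBias, int tasksPerExecutor, int maxExecutors) {
        this.runtimeBias = runtimeBias;
        this.tasksPerExecutor = tasksPerExecutor;
        this.maxExecutors = maxExecutors;
    }

    @Override
    public int targetExecutors(int pendingTasks, int runningExecutors) {
        // Executors needed to run every pending task at once (ceiling division).
        int ideal = (pendingTasks + tasksPerExecutor - 1) / tasksPerExecutor;
        // Blend between keeping what we already have (cheapest) and the ideal
        // (fastest); never scale down below the current count here.
        int blended = (int) Math.ceil(
            runningExecutors + runtimeBias * Math.max(0, ideal - runningExecutors));
        return Math.min(blended, maxExecutors);
    }
}
```

      A driver could then swap policies per workload, e.g. `new CostRuntimePolicy(1.0, 4, 100)` for latency-sensitive jobs versus a low bias for batch jobs, without changing the allocation manager itself.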


          People

            Assignee: Unassigned
            Reporter: Holden Karau
            Votes: 3
            Watchers: 8
