SPARK-31107

Extend FairScheduler to support pool level resource isolation


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, Spark provides only two types of schedulers, FIFO and FAIR, and in high-concurrency SQL scenarios each exposes drawbacks.

      FIFO: a large SQL query can occupy all the resources and easily cause congestion for the queries behind it.

      FAIR: the tasksets of one pool may occupy all the resources because there is no hard limit on the maximum usage of each pool. This situation arises frequently under heavy workloads.

      So we propose to add a maxShare argument to the FairScheduler to control the maximum number of running tasks for each pool.
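To make the proposal concrete, a pool configuration could look like the sketch below. `schedulingMode`, `minShare`, and `weight` are existing fairscheduler.xml properties; `maxShare` is the new property this issue proposes, so its name and semantics here are assumptions, not an existing Spark setting.

```xml
<?xml version="1.0"?>
<!-- Sketch of a fairscheduler.xml using the proposed maxShare property. -->
<allocations>
  <pool name="adhoc">
    <schedulingMode>FAIR</schedulingMode>
    <minShare>10</minShare>
    <weight>1</weight>
    <!-- proposed (hypothetical): never run more than 50 tasks concurrently in this pool -->
    <maxShare>50</maxShare>
  </pool>
</allocations>
```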

      One thing that needs attention is handling this carefully so that the `ExecutorAllocationManager` can still release resources.
      e.g. Suppose we have 100 executors and a pool whose max concurrency is 50: if the running tasks are spread across all 100 executors, no executor may ever stay idle, so none of them can be released.
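The interaction with dynamic allocation can be illustrated with a small simulation (assumed numbers, not Spark code): 100 executors, a per-pool cap of 50 running tasks, and tasks placed round-robin. Every executor gets a task again before a typical idle timeout elapses, so no executor ever becomes releasable even though half the cluster's capacity is unused.

```python
# Illustrative simulation (not Spark code) of why a 50-task cap alone does
# not free executors: the 50 task slots rotate round-robin over all 100
# executors, so every executor is reused before the idle timeout expires.

NUM_EXECUTORS = 100
MAX_CONCURRENCY = 50   # proposed per-pool cap on running tasks
ROUNDS = 20            # scheduling rounds to simulate
IDLE_TIMEOUT = 5       # rounds an executor must stay idle to be released

last_busy = [0] * NUM_EXECUTORS   # round in which each executor last ran a task
max_idle = [0] * NUM_EXECUTORS    # longest idle stretch observed per executor
cursor = 0                        # round-robin position

for rnd in range(1, ROUNDS + 1):
    # place this round's 50 tasks round-robin across all executors
    for _ in range(MAX_CONCURRENCY):
        ex = cursor % NUM_EXECUTORS
        max_idle[ex] = max(max_idle[ex], rnd - last_busy[ex])
        last_busy[ex] = rnd
        cursor += 1

releasable = [ex for ex in range(NUM_EXECUTORS) if max_idle[ex] >= IDLE_TIMEOUT]
print(len(releasable))  # 0 — no executor is ever idle long enough
```

With round-robin placement each executor runs a task every other round, so its idle gap never exceeds 2 rounds and the allocation manager has nothing to reclaim.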

      One idea is to bind executors to pools, and then schedule each pool's tasks only on the executors bound to that pool.
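A minimal sketch of that binding idea, under assumed names (`bind_executors`, the proportional-to-maxShare split — none of these are Spark APIs): executors are partitioned per pool, so when a pool has no active tasksets its entire slice of executors sits idle and becomes eligible for release.

```python
# Hypothetical sketch of pool-to-executor binding. Each pool gets a fixed
# slice of executors sized by its maxShare; its tasks are only offered
# executors from that slice, so an inactive pool's executors stay idle
# and can be reclaimed by dynamic allocation.

def bind_executors(executors, pool_max_shares):
    """Partition executors across pools proportionally to their maxShare."""
    total = sum(pool_max_shares.values())
    bindings, start = {}, 0
    for pool, share in pool_max_shares.items():
        count = len(executors) * share // total
        bindings[pool] = executors[start:start + count]
        start += count
    return bindings

executors = [f"exec-{i}" for i in range(100)]
bindings = bind_executors(executors, {"etl": 50, "adhoc": 50})

# Tasks from pool "etl" would only be scheduled on bindings["etl"]; if
# pool "adhoc" is inactive, its 50 executors idle out and can be released.
print(len(bindings["etl"]), len(bindings["adhoc"]))  # 50 50
```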


            People

              Assignee: Unassigned
              Reporter: liupengcheng
              Votes: 0
              Watchers: 1
