SPARK-26755: Optimize Spark Scheduler to dequeue speculative tasks more efficiently


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Scheduler, Spark Core
    • Labels: None

    Description

      Currently, the Spark scheduler takes quite some time to dequeue speculative tasks for larger task sets within a stage (100,000 tasks or more) when speculation is turned on. On further analysis, it was found that the "task-result-getter" threads remain blocked waiting on one of the dispatcher-event-loop threads, which holds the lock on the TaskSchedulerImpl object:

      // TaskSchedulerImpl.resourceOffers holds the TaskSchedulerImpl lock for the entire offer round
      def resourceOffers(offers: IndexedSeq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
        // ... task matching, including TaskSetManager.dequeueSpeculativeTask, runs under this lock ...
      }

      Much of the time under this lock is spent executing the method "dequeueSpeculativeTask" in TaskSetManager.scala, which slows down the overall running time of the Spark job. We monitored the time utilization of that lock for the whole duration of the job and it was close to 50%, i.e. the code within the synchronized block was running for almost half the duration of the entire Spark job. Screenshots of the thread dumps are attached below for reference.
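
      For illustration only, here is a minimal sketch (not the actual TaskSetManager code; the class and method names are hypothetical) of one way to make the dequeue cheaper: instead of re-scanning every speculatable task index on each resource offer while holding the lock, keep the speculatable indices bucketed by preferred executor and host, so a single offer only touches the relevant bucket and lazily discards stale entries.

      import scala.collection.mutable

      // Hypothetical sketch, not the merged change: bucket speculatable task indices by
      // preferred executor/host so one resource offer does not scan the whole set.
      class SpeculativePendingTasks {
        private val forExecutor = new mutable.HashMap[String, mutable.ArrayBuffer[Int]]
        private val forHost = new mutable.HashMap[String, mutable.ArrayBuffer[Int]]
        private val noPrefs = new mutable.ArrayBuffer[Int]

        def add(index: Int, execs: Seq[String], hosts: Seq[String]): Unit = {
          execs.foreach(e => forExecutor.getOrElseUpdate(e, new mutable.ArrayBuffer[Int]) += index)
          hosts.foreach(h => forHost.getOrElseUpdate(h, new mutable.ArrayBuffer[Int]) += index)
          if (execs.isEmpty && hosts.isEmpty) noPrefs += index
        }

        // Pop the last still-runnable index from a bucket; indices that already finished or
        // already have a speculative copy are dropped lazily, so repeated offers stay cheap.
        private def pop(bucket: mutable.ArrayBuffer[Int], runnable: Int => Boolean): Option[Int] = {
          while (bucket.nonEmpty) {
            val index = bucket.remove(bucket.size - 1)
            if (runnable(index)) return Some(index)
          }
          None
        }

        def dequeue(execId: String, host: String, runnable: Int => Boolean): Option[Int] = {
          forExecutor.get(execId).flatMap(pop(_, runnable))
            .orElse(forHost.get(host).flatMap(pop(_, runnable)))
            .orElse(pop(noPrefs, runnable))
        }
      }

      Whether bucketing like this (or some other structure) is the right change is up to the actual fix; the point of the sketch is only that the per-offer cost under the TaskSchedulerImpl lock should not grow with the total number of speculatable tasks.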

      Attachments

        1. Screen Shot 2019-01-28 at 11.22.42 AM.png
          451 kB
          Parth Gandhi
        2. Screen Shot 2019-01-28 at 11.21.25 AM.png
          780 kB
          Parth Gandhi
        3. Screen Shot 2019-01-28 at 11.21.05 AM.png
          563 kB
          Parth Gandhi


            People

              Assignee: Parth Gandhi
              Reporter: Parth Gandhi
              Votes: 0
              Watchers: 2
