SPARK-25231

Running a Large Job with Speculation On Causes Executor Heartbeats to Time Out on Driver

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.1
    • Fix Version/s: 2.3.2, 2.4.0
    • Component/s: Scheduler, Spark Core
    • Labels: None

    Description

      Running a large Spark job with speculation turned on caused executor heartbeats to time out on the driver after some time. Eventually, once the maximum number of executor failures was reached, the job failed.
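
      For orientation, the settings involved in this scenario can be sketched as below. The report does not give the exact configuration used, so this is only an illustration: the application name is hypothetical, and the heartbeat and timeout values are simply Spark's documented defaults, not values taken from the failing job.

```scala
import org.apache.spark.SparkConf

// Hypothetical configuration sketch; not the reporter's actual settings.
val conf = new SparkConf()
  .setAppName("speculation-heartbeat-sketch")      // hypothetical name
  .set("spark.speculation", "true")                // speculation on, as in the report
  .set("spark.executor.heartbeatInterval", "10s")  // documented default interval
  .set("spark.network.timeout", "120s")            // documented default timeout
```

      The limit on executor failures mentioned above depends on the cluster manager, so it is left out of the sketch.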

          People

            Assignee: Parth Gandhi (pgandhi)
            Reporter: Parth Gandhi (pgandhi)
            Votes: 0
            Watchers: 4
