Spark / SPARK-29287

Executors should not receive any offers before they are actually constructed

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Executors send RegisterExecutor messages to the driver in onStart.

      The driver puts the executor's data into its map of ready-to-serve executors if it can, then sends RegisteredExecutor back to the executor. From this point on, the driver can make offers to this executor.

      But the executor is not fully constructed yet. Only when it receives RegisteredExecutor does it start to construct itself: it initializes the block manager, possibly registers with the local shuffle server (with retries), and then starts sending heartbeats to the driver.

      A task allocated in this window may fail if the executor fails to start or cannot begin heartbeating to the driver in time.
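
The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual code: the `Driver` and `Executor` classes, the `wait_for_ready` switch, and the `on_executor_ready` message are all hypothetical names introduced for illustration. The sketch contrasts offering tasks as soon as the executor registers with holding offers back until the executor reports that construction has finished.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    executor_id: str
    constructed: bool = False  # True once block manager, heartbeats, etc. are up

    def finish_construction(self) -> None:
        self.constructed = True

    def run(self, task: str) -> str:
        if not self.constructed:
            return f"{task} FAILED on {self.executor_id}: executor not ready"
        return f"{task} OK on {self.executor_id}"


@dataclass
class Driver:
    wait_for_ready: bool  # hypothetical switch: gate offers on a readiness signal
    offerable: dict = field(default_factory=dict)  # executors that may get offers
    pending: dict = field(default_factory=dict)    # registered, not yet ready

    def on_register_executor(self, ex: Executor) -> None:
        # In both modes the driver would reply with RegisteredExecutor here.
        if self.wait_for_ready:
            self.pending[ex.executor_id] = ex
        else:
            self.offerable[ex.executor_id] = ex

    def on_executor_ready(self, executor_id: str) -> None:
        # Hypothetical message the executor sends after it is constructed.
        ex = self.pending.pop(executor_id, None)
        if ex is not None:
            self.offerable[executor_id] = ex

    def make_offers(self, task: str) -> list:
        return [ex.run(task) for ex in self.offerable.values()]


def simulate(wait_for_ready: bool) -> list:
    driver = Driver(wait_for_ready=wait_for_ready)
    ex = Executor("exec-1")
    driver.on_register_executor(ex)
    results = driver.make_offers("task-0")  # offer arrives during construction
    ex.finish_construction()
    driver.on_executor_ready(ex.executor_id)
    results += driver.make_offers("task-1")
    return results


if __name__ == "__main__":
    print(simulate(wait_for_ready=False))
    print(simulate(wait_for_ready=True))
```

In this model, the ungated driver hands `task-0` to an executor that is still constructing itself, and the task fails. The gated driver makes no offer until the readiness signal arrives, so only `task-1` is scheduled and it succeeds.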

      Worse, when dynamic allocation and blacklisting are both enabled and the number of running executors has dropped to the configured minimum, the application may block or be torn down if those executors receive tasks before they are fully constructed and any error occurs.

       

            People

              Assignee: Qin Yao Kent Yao 2
              Reporter: Qin Yao Kent Yao 2