Spark / SPARK-41550 Dynamic Allocation on K8S GA / SPARK-40481

Ignore stage fetch failure caused by decommissioned executor

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

      When executor decommissioning is enabled, many stage failures can be caused by FetchFailed errors from decommissioned executors, which in turn can fail the whole job. It would be better not to count such failures toward `spark.stage.maxConsecutiveAttempts`.
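
A minimal sketch of the proposed counting rule, in plain Python rather than Spark's scheduler code. It assumes each failed stage attempt is tagged with the executor that served the shuffle data. `FetchFailure`, `stage_should_abort`, and the `ignore_decommissioned` flag are hypothetical names used only for illustration; only `spark.stage.maxConsecutiveAttempts` comes from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class FetchFailure:
    """One failed stage attempt, tagged with the executor that served the shuffle data."""
    executor_id: str


def stage_should_abort(
    failures: Iterable[FetchFailure],
    decommissioned: Set[str],
    max_consecutive_attempts: int = 4,  # mirrors the default of spark.stage.maxConsecutiveAttempts
    ignore_decommissioned: bool = True,  # hypothetical switch for the proposed behavior
) -> bool:
    """Return True once enough consecutive fetch failures have been counted."""
    counted = 0
    for failure in failures:
        # Proposed rule: a failure caused by a decommissioned executor is expected,
        # so it does not count toward the attempt limit.
        if ignore_decommissioned and failure.executor_id in decommissioned:
            continue
        counted += 1
        if counted >= max_consecutive_attempts:
            return True
    return False


# Three failures from a decommissioned executor, one from a healthy one.
failures = [FetchFailure("exec-1")] * 3 + [FetchFailure("exec-2")]
decommissioned = {"exec-1"}

print(stage_should_abort(failures, decommissioned, ignore_decommissioned=False))  # True
print(stage_should_abort(failures, decommissioned, ignore_decommissioned=True))   # False
```

With the rule disabled, all four failures count and the stage is aborted. With it enabled, only the failure from `exec-2` counts, so the stage can still be retried.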

          People

            Assignee: warrenzhu25 Zhongwei Zhu
            Reporter: warrenzhu25 Zhongwei Zhu
