SPARK-46343

Spark cannot support Docker bridge network in YARN

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0, 3.5.1
    • Fix Version/s: None
    • Component/s: YARN
    • Labels: None
    • Environment:
      OS: Ubuntu 22.04.2 LTS
      JDK Version: 1.8
      Hadoop Version: 3.3.6
      Spark Version: 3.5.1

    Description

      Hello Spark team,

      I recently found a possible bug in Spark's YarnAllocator.

      When I run Spark applications on YARN with the Docker bridge network, the job fails with an address-binding error on the executor side.

      I believe this is caused by the YarnAllocator implementation in Spark: the executor tries to bind to the hostname of the NodeManager instead of the hostname of the container. With the host network this works, because the container shares the NodeManager's network stack. With the bridge network the container has its own address, so the bind fails.
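
      The failure mode described above can be illustrated outside of Spark with a minimal sketch. It is not Spark or YARN code; the helper name `can_bind` and both addresses are hypothetical stand-ins. Loopback plays the role of the container's own address, and 192.0.2.1 (a documentation-only address) plays the role of a NodeManager address that is not assigned to any interface inside the container. On a typical Linux host the second bind is expected to fail with "Cannot assign requested address".

```python
import socket


def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a TCP socket can bind to (host, port) on this machine."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))  # port 0 lets the OS pick a free port
        return True
    except OSError:
        # Covers name-resolution failures (socket.gaierror) as well as
        # EADDRNOTAVAIL, raised when the address is not local to this host.
        return False


if __name__ == "__main__":
    # Hypothetical stand-ins, not values taken from the failing cluster.
    candidates = [
        ("container address", "127.0.0.1"),
        ("NodeManager address", "192.0.2.1"),
    ]
    for label, host in candidates:
        print(f"{label} ({host}): bindable={can_bind(host)}")
```

      Running the same check inside a bridge-network container with the real NodeManager hostname substituted in would be one way to confirm the diagnosis independently of Spark.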

      For more details, please check out RCA - Spark + YARN Docker Bridge Network.

      It looks like the YARN Container API does not return any information about the container's hostname. Does that mean that solving this issue would also require changes on the Hadoop YARN side?

      Please let me know if you have any questions. Many thanks!

      Best Regards,

      Jingwei Zhang

      Attachments

        1. spark-yarn-failure-rca.png
          157 kB
          Jingwei (Sophie) Zhang

          People

            Assignee: Unassigned
            Reporter: Jingwei (Sophie) Zhang (jwz16)