[SPARK-46912] Spark-submit in cluster mode with standalone cluster uses wrong JAVA_HOME path

Description

When running spark-submit against a standalone cluster in cluster mode, the worker that launches the driver uses the JAVA_HOME value from the submitting machine instead of the worker's own.

To reproduce:

• Create a standalone cluster with Docker Compose, setting JAVA_HOME on each worker to a value different from the local (submitting) machine.
• Run spark-submit with --deploy-mode cluster (a programmatic equivalent is sketched after this list).
• Watch the worker log; the driver launch line shows: DriverRunner: Launch Command: "<value from local JAVA_HOME>" "-cp" ...
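For the second step, the same submission can be driven programmatically through org.apache.spark.launcher.SparkLauncher. A minimal sketch, assuming a master reachable at spark://spark-master:7077 and an application jar at /opt/app/example-app.jar with main class com.example.ExampleApp (all three are placeholders for the docker-compose setup):

{code:scala}
import org.apache.spark.launcher.SparkLauncher

object SubmitToStandaloneCluster {
  def main(args: Array[String]): Unit = {
    // Equivalent to: spark-submit --master spark://spark-master:7077 --deploy-mode cluster ...
    val process = new SparkLauncher()
      .setMaster("spark://spark-master:7077")      // standalone master from the compose file
      .setDeployMode("cluster")                    // driver is launched on a worker, not locally
      .setAppResource("/opt/app/example-app.jar")  // jar path that the workers can see
      .setMainClass("com.example.ExampleApp")
      .launch()
    process.waitFor()                              // wait for the spark-submit child process
  }
}
{code}

Either way, the worker log should show the driver being launched with the submitter's JAVA_HOME.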

Reason:

When the Master creates a new driver in its receiveAndReply method, it builds the driver description command from the submitter's environment variables. Later, when the worker launches the driver, it builds a new local command, but that command still takes its environment from the driver description (which came from the submitter). As a result, the java launch command is assembled with the submitter's JAVA_HOME path instead of the worker's.
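A toy model of this flow (the class and method below are invented for illustration, not the real Spark internals) shows why the worker ends up pointing at the submitter's JDK:

{code:scala}
// The Master stores the environment captured on the submitter's machine
// inside the driver description it hands to a worker.
final case class DriverDescriptionSketch(environment: Map[String, String])

object BuggyWorkerLaunch {
  // The worker resolves JAVA_HOME from the driver description, i.e. from the submitter.
  def javaBinary(desc: DriverDescriptionSketch): String = {
    val javaHome = desc.environment.getOrElse("JAVA_HOME", "/usr/lib/jvm/default")
    s"$javaHome/bin/java"
  }

  def main(args: Array[String]): Unit = {
    // JAVA_HOME as it exists on the submitting machine only.
    val desc = DriverDescriptionSketch(Map("JAVA_HOME" -> "/submitter/jdk"))
    // The printed launch command points at a JDK path that may not exist on the worker.
    println(s"""Launch Command: "${javaBinary(desc)}" "-cp" ...""")
  }
}
{code}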

Suggestion:

In the buildLocalCommand method of org.apache.spark.deploy.worker.CommandUtils, replace JAVA_HOME and SPARK_HOME with the worker's own values.
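A minimal sketch of that idea, assuming a small helper that is handed the command's environment map (the object and method names are made up; the real change would sit inside CommandUtils.buildLocalCommand):

{code:scala}
// Overwrite submitter-provided JAVA_HOME / SPARK_HOME with the worker's own
// values before the local launch command is rendered.
object WorkerEnvOverride {

  // Variables whose values must come from the worker, never from the submitter.
  private val workerOwnedVars = Seq("JAVA_HOME", "SPARK_HOME")

  /** Returns the command environment with worker-local values taking precedence. */
  def apply(
      commandEnv: Map[String, String],
      workerEnv: Map[String, String] = sys.env): Map[String, String] =
    workerOwnedVars.foldLeft(commandEnv) { (env, key) =>
      workerEnv.get(key) match {
        case Some(localValue) => env + (key -> localValue) // worker value wins
        case None             => env                       // nothing local, keep as-is
      }
    }
}
{code}

With the environment adjusted this way, the "<JAVA_HOME>/bin/java" prefix in the DriverRunner launch command would resolve against the JDK installed on the worker.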


People

Assignee: Unassigned
Reporter: Danh Pham