Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.5.0
- Fix Version/s: None
Description
When running spark-submit against a standalone cluster in cluster mode, the worker machine uses the JAVA_HOME value from the submitter machine instead of the worker machine's own value.
To reproduce:
- Create a standalone cluster using Docker Compose, with JAVA_HOME set in each worker to a value different from the submitter machine's.
- Run spark-submit with --deploy-mode cluster.
- Monitor the worker log; it will print: DriverRunner: Launch Command: "<value from the submitter's local JAVA_HOME>" "-cp" ...
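The reproduce steps above can be sketched as a single command; the master URL, application class, and jar path below are placeholders, not taken from the report:

```shell
# Hypothetical invocation: substitute your own master URL, class, and jar.
# The bug manifests only with --deploy-mode cluster, where the driver is
# launched on a worker machine.
spark-submit \
  --master spark://master:7077 \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  /opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
```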
Reason:
When the Master creates a new driver in the receiveAndReply method, it uses the environment variables from the submitter to build the driver description command. Later, when the worker launches the driver, a new local command is built, but it still uses the environment variables from the driver description (which came from the submitter). As a result, the java command is built with the submitter's JAVA_HOME path instead of the worker's.
Suggestion:
Replace JAVA_HOME and SPARK_HOME in the buildLocalCommand method of org.apache.spark.deploy.worker.CommandUtils with the worker's values.
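A minimal sketch of the suggested fix, assuming simplified shapes for Command and buildLocalCommand (the real Spark signatures carry more parameters; this only illustrates overriding the two variables from the worker's own environment):

```scala
// Sketch only: simplified stand-ins, not the actual Spark source.
// Command here models the driver description forwarded from the submitter,
// including the submitter's environment map.
case class Command(mainClass: String,
                   arguments: Seq[String],
                   environment: Map[String, String])

object CommandUtils {
  // Proposed behavior: when building the local (worker-side) command,
  // override JAVA_HOME and SPARK_HOME with the worker's own values so
  // the submitter's paths are ignored.
  def buildLocalCommand(command: Command,
                        workerEnv: Map[String, String] = sys.env): Command = {
    val overrides = Seq("JAVA_HOME", "SPARK_HOME").flatMap { key =>
      workerEnv.get(key).map(key -> _)
    }.toMap
    command.copy(environment = command.environment ++ overrides)
  }
}
```

With this change, the worker's java binary would be resolved from its own JAVA_HOME even when the submitter's environment differs.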