Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
3.3.0
-
The problem occurs in Windows if spark-shell is called from a bash session.
NOTE: the fix also applies to spark-submit and beeline, since they are also launched through spark-class.
-
Important
-
38228
Description
A Spark pull request fixes this issue, and also fixes a related build error that occurs in sbt sessions run from cygwin and msys/mingw bash.
If a Windows user tries to start a spark-shell session by calling the bash script (rather than the spark-shell.cmd script), it fails with a confusing error message. The spark-class script calls launcher/src/main/java/org/apache/spark/launcher/Main.java to generate the command-line arguments, but on Windows the launcher produces output in the format expected by the .cmd version of the script rather than the bash version.
The launcher Main method, when called in environments other than Windows, interleaves NULL characters between the command-line arguments. It should do the same on Windows when called from the bash script. Instead, it incorrectly assumes that if the OS is Windows, it is being called by the .cmd version of the script.
The resulting error message is unhelpful:
[lots of ugly stuff omitted] /opt/spark/bin/spark-class: line 100: CMD: bad array subscript
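To make the format mismatch concrete, here is a minimal sketch of the two output styles described above. The class and method names (LauncherOutputSketch, formatForBash, formatForCmd) are hypothetical and are not taken from the Spark launcher, and the quoting in the cmd variant is deliberately simplified.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the actual Spark launcher code.
public class LauncherOutputSketch {

    // Bash-style output: every argument is followed by a NUL character,
    // which cannot occur inside an argument, so a bash script can split on it.
    static String formatForBash(List<String> args) {
        StringBuilder sb = new StringBuilder();
        for (String arg : args) {
            sb.append(arg).append('\0');
        }
        return sb.toString();
    }

    // cmd-style output: a single line with space-separated arguments.
    // Simplified quoting: only arguments containing a space are quoted.
    static String formatForCmd(List<String> args) {
        StringBuilder sb = new StringBuilder();
        for (String arg : args) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (arg.contains(" ")) {
                sb.append('"').append(arg).append('"');
            } else {
                sb.append(arg);
            }
        }
        return sb.toString();
    }

    public static void main(String[] unused) {
        List<String> args = Arrays.asList("java", "-cp", "C:\\spark jars\\*", "Main");
        // Show the NUL separators visibly as "\0" when printing.
        System.out.println(formatForBash(args).replace("\0", "\\0"));
        System.out.println(formatForCmd(args));
    }
}
```

A bash script that expects NUL-terminated arguments finds none in the single-line cmd form, which is consistent with the script failing while it assembles its command array.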
The launcher's Main can tell that a request comes from a bash session because the SHELL environment variable is set. SHELL is normally set in any of the various Windows bash environments (cygwin, mingw64, msys2, etc.) and is not normally set in a native Windows (cmd) environment. In the spark-class.cmd script, SHELL is intentionally unset to avoid problems, so bash users can still call the .cmd scripts if they prefer (they work as before).
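The decision described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the code from the pull request: ShellDetectionSketch, isWindows and useBashFormat are hypothetical names, and the environment is passed in as a map so the logic is easy to exercise.

```java
import java.util.Map;

// Hypothetical illustration only; not the actual Spark launcher code.
public class ShellDetectionSketch {

    // Assumption: the OS is treated as Windows when os.name starts with "Windows".
    static boolean isWindows(String osName) {
        return osName != null && osName.startsWith("Windows");
    }

    // Use the NUL-separated (bash) format on every non-Windows OS, and also
    // on Windows when SHELL is set, which indicates a bash-like caller.
    static boolean useBashFormat(String osName, Map<String, String> env) {
        String shell = env.get("SHELL");
        boolean shellSet = shell != null && !shell.isEmpty();
        return !isWindows(osName) || shellSet;
    }

    public static void main(String[] args) {
        boolean bash = useBashFormat(System.getProperty("os.name"), System.getenv());
        System.out.println(bash ? "bash format (NUL-separated)" : "cmd format (single line)");
    }
}
```

Because the .cmd script unsets SHELL before invoking the launcher, a check of this kind would still select the cmd format when a bash user calls the .cmd script directly.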