SPARK-24577

Spark submit fails with documentation example spark-pi


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.0, 2.3.1
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core
    • Labels: None

    Description

      The spark-submit example in the Kubernetes documentation fails for me.

      .\spark-submit.cmd --master k8s://https://my-k8s:8443 ^
        --conf spark.kubernetes.namespace=my-namespace ^
        --deploy-mode cluster ^
        --name spark-pi ^
        --class org.apache.spark.examples.SparkPi ^
        --conf spark.executor.instances=5 ^
        --conf spark.kubernetes.container.image=gcr.io/ynli-k8s/spark:v2.3.0 ^
        --conf spark.kubernetes.driver.pod.name=spark-pi-driver ^
        local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
      

      Error in the driver log:

      ++ id -u
      + myuid=0
      ++ id -g
      + mygid=0
      ++ getent passwd 0
      + uidentry=root:x:0:0:root:/root:/bin/ash
      + '[' -z root:x:0:0:root:/root:/bin/ash ']'
      + SPARK_K8S_CMD=driver
      + '[' -z driver ']'
      + shift 1
      + SPARK_CLASSPATH=':/opt/spark/jars/*'
      + env
      + grep SPARK_JAVA_OPT_
      + sed 's/[^=]*=\(.*\)/\1/g'
      + readarray -t SPARK_JAVA_OPTS
      + '[' -n '/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' ']'
      + SPARK_CLASSPATH=':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
      + '[' -n '' ']'
      + case "$SPARK_K8S_CMD" in
      + CMD=(${JAVA_HOME}/bin/java "${SPARK_JAVA_OPTS[@]}" -cp "$SPARK_CLASSPATH" -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS)
      + exec /sbin/tini -s -- /usr/lib/jvm/java-1.8-openjdk/bin/java -Dspark.kubernetes.namespace=my-namespace -Dspark.driver.port=7078 -Dspark.master=k8s://https://my-k8s:8443  -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar,/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar -Dspark.driver.blockManager.port=7079 -Dspark.app.id=spark-311b7351345240fd89d6d86eaabdff6f -Dspark.kubernetes.driver.pod.name=spark-pi-driver -Dspark.executor.instances=5 -Dspark.app.name=spark-pi -Dspark.driver.host=spark-pi-ef6be7cac60a3f789f9714b2ebd1c68c-driver-svc.my-namespace.svc -Dspark.submit.deployMode=cluster -Dspark.kubernetes.executor.podNamePrefix=spark-pi-ef6be7cac60a3f789f9714b2ebd1c68c -Dspark.kubernetes.container.image=gcr.io/ynli-k8s/spark:v2.3.0 -cp ':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' -Xms1g -Xmx1g -Dspark.driver.bindAddress=172.101.1.40 org.apache.spark.examples.SparkPi
      Error: Could not find or load main class org.apache.spark.examples.SparkPi
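
      For what it's worth, a semicolon is not a classpath separator for the Linux JVM inside the container, so the ';'-joined example JAR entry above amounts to a single non-existent path and the class cannot be resolved. A minimal check one could run from inside the driver container, assuming the JAR sits at the documented image path (the "master URL" failure in the second case is expected, since no Spark configuration is passed):

      # The ';'-joined entry is read as one (non-existent) file on Linux,
      # reproducing the error above:
      java -cp '/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' org.apache.spark.examples.SparkPi
      # => Error: Could not find or load main class org.apache.spark.examples.SparkPi

      # With ':' as the separator the class is found; it then stops with
      # "A master URL must be set in your configuration", which is unrelated:
      java -cp '/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' org.apache.spark.examples.SparkPi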
      

      I am also running the example via spark-operator, and that one works for me. This is the driver log from the spark-operator run:

      ++ id -u
      + myuid=0
      ++ id -g
      + mygid=0
      ++ getent passwd 0
      + uidentry=root:x:0:0:root:/root:/bin/ash
      + '[' -z root:x:0:0:root:/root:/bin/ash ']'
      + SPARK_K8S_CMD=driver
      + '[' -z driver ']'
      + shift 1
      + SPARK_CLASSPATH=':/opt/spark/jars/*'
      + env
      + grep SPARK_JAVA_OPT_
      + sed 's/[^=]*=\(.*\)/\1/g'
      + readarray -t SPARK_JAVA_OPTS
      + '[' -n /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar ']'
      + SPARK_CLASSPATH=':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
      + '[' -n '' ']'
      + case "$SPARK_K8S_CMD" in
      + CMD=(${JAVA_HOME}/bin/java "${SPARK_JAVA_OPTS[@]}" -cp "$SPARK_CLASSPATH" -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY
      -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS)
      + exec /sbin/tini -s -- /usr/lib/jvm/java-1.8-openjdk/bin/java
      -Dspark.kubernetes.driver.label.sparkoperator.k8s.io/app-id=spark-pi-2557211557
      -Dspark.kubernetes.container.image=gcr.io/ynli-k8s/spark:v2.3.0
      -Dspark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=spark-pi
      -Dspark.app.name=spark-pi
      -Dspark.executor.instances=7
      -Dspark.driver.blockManager.port=7079
      -Dspark.driver.cores=0.100000
      -Dspark.kubernetes.driver.label.version=2.3.0
      -Dspark.kubernetes.executor.podNamePrefix=spark-pi-607e0943cf32319883cc3beb2e02be4f
      -Dspark.executor.memory=512m
      -Dspark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=spark-pi
      -Dspark.kubernetes.authenticate.driver.serviceAccountName=spark
      -Dspark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true
      -Dspark.kubernetes.driver.limit.cores=200m
      -Dspark.driver.host=spark-pi-607e0943cf32319883cc3beb2e02be4f-driver-svc.big-data-analytics.svc
      -Dspark.kubernetes.driver.pod.name=spark-pi-607e0943cf32319883cc3beb2e02be4f-driver
      -Dspark.submit.deployMode=cluster
      -Dspark.kubernetes.executor.label.sparkoperator.k8s.io/app-id=spark-pi-2557211557
      -Dspark.kubernetes.submission.waitAppCompletion=false
      -Dspark.kubernetes.driver.annotation.sparkoperator.k8s.io/volumes.test-volume=Cgt0ZXN0LXZvbHVtZRITChEKBC90bXASCURpcmVjdG9yeQ==
      -Dspark.driver.port=7078
      -Dspark.app.id=spark-a7cdcb5ce1e54879a5286979a197f791
      -Dspark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true
      -Dspark.driver.memory=512m
      -Dspark.kubernetes.executor.label.version=2.3.0
      -Dspark.kubernetes.executor.annotation.sparkoperator.k8s.io/volumemounts.test-volume=Cgt0ZXN0LXZvbHVtZRAAGgQvdG1wIgA=
      -Dspark.kubernetes.executor.annotation.sparkoperator.k8s.io/volumes.test-volume=Cgt0ZXN0LXZvbHVtZRITChEKBC90bXASCURpcmVjdG9yeQ==
      -Dspark.kubernetes.driver.annotation.sparkoperator.k8s.io/ownerreference=ChBTcGFya0FwcGxpY2F0aW9uGghzcGFyay1waSIkZjkwZTBlN2YtNzJlNy0xMWU4LTk2ZmEtMDAxYTRhMTYwMTU1Kh1zcGFya29wZXJhdG9yLms4cy5pby92MWFscGhhMQ==
      -Dspark.master=k8s://https://172.30.0.1:443
      

      The only obvious difference I see is that the failing run has this line:

      + '[' -n '/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' ']'
      

      The working run, by contrast, has this line:

      + '[' -n /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar ']'
      

      Note the semicolon instead of a colon as the path separator, and that the string is quoted. Since ';' is the Windows classpath separator and ':' the Linux one, it looks like spark-submit on Windows joins the JAR paths with the separator of the submitting OS, which the Linux driver container then cannot parse.
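
      If the separator really does come from the submitting OS (java.io.File.pathSeparator is ';' on Windows and ':' on Linux), a plausible workaround is to submit the same job from a Linux shell (e.g. WSL or a Linux host) against the same cluster. A sketch of the equivalent submission, untested and assuming the same cluster, namespace and image:

      ./bin/spark-submit \
        --master k8s://https://my-k8s:8443 \
        --conf spark.kubernetes.namespace=my-namespace \
        --deploy-mode cluster \
        --name spark-pi \
        --class org.apache.spark.examples.SparkPi \
        --conf spark.executor.instances=5 \
        --conf spark.kubernetes.container.image=gcr.io/ynli-k8s/spark:v2.3.0 \
        --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
        local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar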

       

      I am running spark-submit on Windows (spark-2.3.1-bin-hadoop2.7).

      Kubernetes 1.9.1


            People

              Assignee: Unassigned
              Reporter: Kuku1
              Votes: 0
              Watchers: 2
