
SPARK-27491: Spark REST API - "org.apache.spark.deploy.SparkSubmit --status" returns an empty response, therefore Airflow won't integrate with Spark 2.3.x



    Description

      This issue must have been introduced after Spark 2.1.1, as it works in that version. It affects me on Spark 2.3.0 and 2.3.3. I am using Spark standalone mode, if that makes a difference.

      As the traces below show, Spark 2.3.3 returns an empty response while 2.1.1 returns a proper status response.
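      For quick reproduction without the full bash -x traces below, a minimal sketch (the master URL and driver id are the placeholders used throughout this report; spark-submit dispatches to the same SparkSubmit class shown in the traces):

      # Ask the standalone master for the status of a submitted driver.
      # Spark 2.1.1 prints a SubmissionStatusResponse JSON; Spark 2.3.x
      # exits silently with no output at all.
      $SPARK_HOME/bin/spark-submit \
        --master spark://domainhere:6066 \
        --status driver-20190417130324-0009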


      Spark 2.1.1:

      [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + export SPARK_HOME=/home/ec2here/spark_home1
      + SPARK_HOME=/home/ec2here/spark_home1
      + '[' -z /home/ec2here/spark_home1 ']'
      + . /home/ec2here/spark_home1/bin/load-spark-env.sh
      ++ '[' -z /home/ec2here/spark_home1 ']'
      ++ '[' -z '' ']'
      ++ export SPARK_ENV_LOADED=1
      ++ SPARK_ENV_LOADED=1
      ++ parent_dir=/home/ec2here/spark_home1
      ++ user_conf_dir=/home/ec2here/spark_home1/conf
      ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
      ++ set -a
      ++ . /home/ec2here/spark_home1/conf/spark-env.sh
      +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      ++++ ulimit -n 1048576
      ++ set +a
      ++ '[' -z '' ']'
      ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
      ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
      ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
      ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
      ++ export SPARK_SCALA_VERSION=2.10
      ++ SPARK_SCALA_VERSION=2.10
      + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
      + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
      + '[' -d /home/ec2here/spark_home1/jars ']'
      + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
      + '[' '!' -d /home/ec2here/spark_home1/jars ']'
      + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
      + '[' -n '' ']'
      + [[ -n '' ]]
      + CMD=()
      + IFS=
      + read -d '' -r ARG
      ++ build_command org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      ++ printf '%d\0' 0
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + COUNT=10
      + LAST=9
      + LAUNCHER_EXIT_CODE=0
      + [[ 0 =~ ^[0-9]+$ ]]
      + '[' 0 '!=' 0 ']'
      + CMD=("${CMD[@]:0:$LAST}")
      + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the status of submission driver-20190417130324-0009 in spark://domainhere:6066.
      19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with SubmissionStatusResponse:

      {
        "action" : "SubmissionStatusResponse",
        "driverState" : "FAILED",
        "serverSparkVersion" : "2.3.3",
        "submissionId" : "driver-20190417130324-0009",
        "success" : true,
        "workerHostPort" : "x.y.211.40:11819",
        "workerId" : "worker-20190417115840-x.y.211.40-11819"
      }

      [ec2here@ip-x-y-160-225 ~]$
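      The "driverState" field in this response is what a polling client keys off. A minimal sketch of pulling it out, assuming the JSON above is saved to status.json and jq is available (both the filename and the use of jq are illustrative, not part of this report):

      # Extract the driver state from a saved status response.
      jq -r '.driverState' status.json   # prints: FAILED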


      Spark 2.3.3:

      [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + '[' -z '' ']'
      ++ dirname /home/ec2here/spark_home/bin/spark-class
      + source /home/ec2here/spark_home/bin/find-spark-home
      ++++ dirname /home/ec2here/spark_home/bin/spark-class
      +++ cd /home/ec2here/spark_home/bin
      +++ pwd
      ++ FIND_SPARK_HOME_PYTHON_SCRIPT=/home/ec2here/spark_home/bin/find_spark_home.py
      ++ '[' '!' -z '' ']'
      ++ '[' '!' -f /home/ec2here/spark_home/bin/find_spark_home.py ']'
      ++++ dirname /home/ec2here/spark_home/bin/spark-class
      +++ cd /home/ec2here/spark_home/bin/..
      +++ pwd
      ++ export SPARK_HOME=/home/ec2here/spark_home
      ++ SPARK_HOME=/home/ec2here/spark_home
      + . /home/ec2here/spark_home/bin/load-spark-env.sh
      ++ '[' -z /home/ec2here/spark_home ']'
      ++ '[' -z '' ']'
      ++ export SPARK_ENV_LOADED=1
      ++ SPARK_ENV_LOADED=1
      ++ export SPARK_CONF_DIR=/home/ec2here/spark_home/conf
      ++ SPARK_CONF_DIR=/home/ec2here/spark_home/conf
      ++ '[' -f /home/ec2here/spark_home/conf/spark-env.sh ']'
      ++ set -a
      ++ . /home/ec2here/spark_home/conf/spark-env.sh
      +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      ++++ ulimit -n 1048576
      ++ set +a
      ++ '[' -z '' ']'
      ++ ASSEMBLY_DIR2=/home/ec2here/spark_home/assembly/target/scala-2.11
      ++ ASSEMBLY_DIR1=/home/ec2here/spark_home/assembly/target/scala-2.12
      ++ [[ -d /home/ec2here/spark_home/assembly/target/scala-2.11 ]]
      ++ '[' -d /home/ec2here/spark_home/assembly/target/scala-2.11 ']'
      ++ export SPARK_SCALA_VERSION=2.12
      ++ SPARK_SCALA_VERSION=2.12
      + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
      + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
      + '[' -d /home/ec2here/spark_home/jars ']'
      + SPARK_JARS_DIR=/home/ec2here/spark_home/jars
      + '[' '!' -d /home/ec2here/spark_home/jars ']'
      + LAUNCH_CLASSPATH='/home/ec2here/spark_home/jars/*'
      + '[' -n '' ']'
      + [[ -n '' ]]
      + set +o posix
      + CMD=()
      + IFS=
      + read -d '' -r ARG
      ++ build_command org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp '/home/ec2here/spark_home/jars/*' org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      ++ printf '%d\0' 0
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + COUNT=10
      + LAST=9
      + LAUNCHER_EXIT_CODE=0
      + [[ 0 =~ ^[0-9]+$ ]]
      + '[' 0 '!=' 0 ']'
      + CMD=("${CMD[@]:0:$LAST}")
      + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp '/home/ec2here/spark_home/conf/:/home/ec2here/spark_home/jars/*' -Xmx2048m org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      [ec2here@ip-x-y-160-225 ~]$ ps -ef | grep -i spark


      As a result, Apache Airflow does not work with Spark 2.3.x: its spark-submit operator stays in the running state forever because it never gets a response from the Spark REST status calls.
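      As a cross-check, and as a possible workaround for pollers, the standalone master's REST status endpoint can also be queried over plain HTTP. A minimal sketch, assuming the REST server is listening on port 6066; the /v1/submissions/status path is my assumption about what the standalone RestSubmissionClient requests, as the trace above only logs the spark:// URL:

      # Query the standalone REST submission server directly instead of
      # going through spark-submit --status.
      curl -s http://domainhere:6066/v1/submissions/status/driver-20190417130324-0009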
