Details
- Type: Bug
- Status: Open
- Priority: Critical
- Resolution: Unresolved
- Affects Version: 1.7.0
- Fix Version: None
- Component: None
Description
We have a 3-node cluster with Mesos 1.7, Spark 2.4, and Docker 18.06.1 installed; one node is the master and the other two are agents. A spark-submit job fails: the UI shows only one task (the driver) launched on one of the agents, in the failed state.
Command:
spark-submit \
--master mesos://****.32:7077 \
--deploy-mode cluster \
--class com.learning.spark.WordCount \
--conf spark.mesos.executor.docker.image=mesosphere/spark:2.4.0-2.2.1-3-hadoop-2.7 \
--conf spark.master.rest.enabled=true \
/home/mapr/mesos/wordcount.jar hdfs://**.36:8020/user/mapr/sparkL/input.txt hdfs://**.36:8020/user/output
Error in one of the logs:
Running on machine: **-i0058
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W1221 16:51:23.857431 17978 state.cpp:478] Failed to find executor forked pid file '/home/**/mesos/mesos-1.7.0/build/workDir/meta/slaves/822a5d52-b8ba-459f-ade2-7f3a2ebd240f-S0/frameworks/77c39bdf-09e3-4cb9-9026-21e900d08318-0007/executors/driver-20181221112019-0006/runs/7c1399ca-4e0a-4bd9-b02e-9c5ca3854c77/pids/forked.pid'
Below is the only property we have set on all the nodes; we have also started the dispatcher:
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/libmesos.so
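For reference, a typical setup for cluster-mode submission looks like the following sketch. The `$SPARK_HOME` path and the Mesos master address are assumptions for illustration, not values from this report:

```shell
# Point Spark at the Mesos native library (the property set on all nodes above).
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/libmesos.so

# Start the MesosClusterDispatcher that cluster-mode spark-submit talks to.
# <mesos-master> is a placeholder; the dispatcher itself listens on port 7077
# by default, which is the port the spark-submit --master URL points at.
$SPARK_HOME/sbin/start-mesos-dispatcher.sh --master mesos://<mesos-master>:5050
```

In this layout, `--master mesos://****.32:7077` in the spark-submit command should point at the dispatcher, while the dispatcher's own `--master` flag points at the Mesos master.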