[OOZIE-2479] SparkContext Not Using Yarn Config - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 4.2.0
Fix Version/s: None
Component/s: workflow
Labels:
None
Environment:

Oozie 4.2.0.2.3.4.0-3485
Spark 1.4.1
Scala 2.10.5
HDP 2.3

Description

The Spark action does not appear to use the jobTracker setting from job.properties (or from the YARN config) when creating the SparkContext. With the jobTracker property set to myDomain:8050 (to match the yarn.resourcemanager.address setting), I can see in the Oozie UI (click on job > action > action configuration) that myDomain:8050 is being submitted. However, when I drill down into the Hadoop job history logs, I see an error indicating that a default of 0.0.0.0:8032 is being used instead. The configuration, workflow, and error are shown below.

job.properties

nameNode=hdfs://myDomain:8020
jobTracker=myOtherDomain:8050
queueName=default
# have also tried yarn-cluster and yarn-client
master=yarn
 
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/bmp/
# I've added the updated Spark libs I need in here
oozie.action.sharelib.for.spark=spark2

workflow

<workflow-app xmlns='uri:oozie:workflow:0.5' name='MyWorkflow'>
    <start to='spark-node' />
    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/bmp/output"/>
            </prepare>
            <master>${master}</master>
            <name>My Workflow</name>
            <class>uk.co.bmp.drivers.MyDriver</class>
            <jar>${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar</jar>
            <spark-opts>--conf spark.yarn.historyServer.address=http://myDomain:18088 --conf spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
            <arg>${nameNode}/bmp/input/input_file.csv</arg>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}]
        </message>
    </kill>
    <end name='end' />
</workflow-app>

Error

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception,Call From myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused. For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
...
at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
...

Where is it pulling 0.0.0.0:8032 from? Why does it not use the address configured in job.properties?
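
For reference, 0.0.0.0:8032 matches the stock ResourceManager defaults that ship in Hadoop's yarn-default.xml. The fragment below shows those two default properties. It is only a sketch of one possible explanation, under the assumption (not confirmed in this report) that the SparkContext created by the launcher is not seeing the cluster's yarn-site.xml and therefore falls back to the built-in values:

```xml
<!-- Stock defaults from Hadoop's yarn-default.xml (not this cluster's settings). -->
<configuration>
  <property>
    <!-- Default ResourceManager host when nothing overrides it -->
    <name>yarn.resourcemanager.hostname</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <!-- Resolves to 0.0.0.0:8032 unless yarn-site.xml sets it explicitly -->
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
</configuration>
```

If that assumption holds, the value set through jobTracker reaches the Oozie launcher configuration but not the Hadoop configuration that the SparkContext builds for itself.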

People

Assignee: Satish Saley

Reporter: Breandán Mac Parland

Votes: 0

Watchers: 3

Dates

Created: 04/Mar/16 16:44

Updated: 09/May/16 09:30