[SPARK-2753] Is the --archives option in yarn-cluster mode supposed to uncompress the file? - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Affects Version/s: 1.0.0
Fix Version/s: None
Component/s: YARN
Labels:
- archives
- cache
- distributed
- yarn
Environment:

CentOS release 6.5 (64-bit) and Hadoop 2.2.0

Description

Hi all,

This is the first issue I have submitted. I googled and searched through the Spark code before ending up here.

When a tar.gz or .zip file is passed to --archives, Spark uploads it to the distributed cache but does not uncompress it.

According to the documentation, it is supposed to uncompress it. Is this a bug?

The launch command is:

/opt/spark-1.0.1/bin/spark-submit --class ProlnatSpark --master yarn-cluster --num-executors 32 --driver-library-path /opt/hadoop/hadoop-2.2.0/lib/native/ --driver-memory 390m --executor-memory 890m --executor-cores 1 --archives=Diccionarios.tar.gz --verbose ProlnatSpark.jar Wikipedias/WikipediaPlain.txt saidaWikipediaSpark

The code in /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala and /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala doesn't seem to uncompress the files.

I hope this helps. Thank you very much.
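
One way to check the behaviour described above is to look at how the archive shows up in a container's working directory. The following is a minimal sketch, assuming (this report does not confirm it) that YARN exposes each --archives entry in the working directory under the archive's own file name, as a directory when it has been unpacked. The helper name archive_state is hypothetical and is not part of Spark or Hadoop.

```python
import os


def archive_state(name, workdir="."):
    """Classify how an --archives entry appears in a working directory.

    Returns "extracted" if `name` resolves to a directory,
    "not extracted" if it is a regular file, and "missing" otherwise.
    Assumption: the entry is exposed under the archive's own file name.
    """
    path = os.path.join(workdir, name)
    if os.path.isdir(path):   # isdir follows symlinks
        return "extracted"
    if os.path.isfile(path):  # still a plain archive file
        return "not extracted"
    return "missing"


if __name__ == "__main__":
    # Archive name taken from the launch command in this report.
    print(archive_state("Diccionarios.tar.gz"))
```

Running the check from inside the driver or an executor, rather than on the submitting machine, is what makes it meaningful, since the working directory in question is the container's own.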

People

Assignee: Unassigned

Reporter: José Manuel Abuín Mosquera

Votes: 0

Watchers: 0

Dates

Created: 30/Jul/14 15:24

Updated: 19/Aug/14 13:18

Resolved: 19/Aug/14 13:18