Spark / SPARK-2753

Is the --archives option in yarn-cluster mode supposed to uncompress the file?

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.0.0
    • Fix Version/s: None
    • Component/s: YARN
    • Environment: CentOS release 6.5 (64 bits) and Hadoop 2.2.0

    Description

      Hi all,

      This is the first issue I have submitted. I googled and searched through the Spark code before arriving here.

      When a .tar.gz or .zip file is passed to --archives, Spark uploads it to the distributed cache but does not uncompress it.

      According to the documentation, it is supposed to uncompress it. Is this a bug?

      The launch command is:

      /opt/spark-1.0.1/bin/spark-submit --class ProlnatSpark --master yarn-cluster --num-executors 32 --driver-library-path /opt/hadoop/hadoop-2.2.0/lib/native/ --driver-memory 390m --executor-memory 890m --executor-cores 1 --archives=Diccionarios.tar.gz --verbose ProlnatSpark.jar Wikipedias/WikipediaPlain.txt saidaWikipediaSpark

      The code in /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala and /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala does not seem to uncompress the files.
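
      One way to check the behavior described above from inside a container is to look at how the archive entry appears on disk. The following is only a minimal sketch, not Spark code: it assumes the localized entry shows up in the container's working directory under a known name, and `classify_localized` and `demo` are hypothetical helper names. The `demo` function simulates both outcomes locally, without a cluster.

```python
import os
import tarfile
import tempfile


def classify_localized(name, workdir="."):
    """Report how an --archives entry appears in a working directory.

    Hypothetical helper: returns "extracted" if the entry is a directory,
    "not extracted" if it is still a regular file, and "missing" otherwise.
    """
    path = os.path.join(workdir, name)
    # os.path.isdir follows symlinks, so a linked directory also counts.
    if os.path.isdir(path):
        return "extracted"
    if os.path.isfile(path):
        return "not extracted"
    return "missing"


def demo():
    """Simulate the possible outcomes in a temporary directory."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Build a small archive standing in for Diccionarios.tar.gz.
        src = os.path.join(tmp, "dict.txt")
        with open(src, "w") as f:
            f.write("palabra\n")
        archive = os.path.join(tmp, "Diccionarios.tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(src, arcname="dict.txt")
        results["as_file"] = classify_localized("Diccionarios.tar.gz", tmp)

        # A plain directory imitates an archive that was unpacked.
        # "Diccionarios" is an illustrative name chosen for this sketch.
        os.makedirs(os.path.join(tmp, "Diccionarios"))
        results["as_dir"] = classify_localized("Diccionarios", tmp)

        results["absent"] = classify_localized("nope", tmp)
    return results


if __name__ == "__main__":
    print(demo())
```

      In a real job, the same check could be run against the executor's current working directory (the default `workdir="."`), with the entry name taken from the value passed to --archives.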

      I hope this helps. Thank you very much.

          People

            Assignee: Unassigned
            Reporter: José Manuel Abuín Mosquera (chema)