Spark / SPARK-10846

Stray META-INF in directory spark-shell is launched from causes problems


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: Spark Shell

    Description

      I observed some perplexing errors while running $SPARK_HOME/bin/spark-shell yesterday (with $SPARK_HOME pointing at a clean 1.5.0 install):

      java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3.S3FileSystem not found
      

      while initializing HiveContext; full example output is here.

      The issue was that a stray META-INF directory from some other project I'd built months ago was sitting in the directory that I'd run spark-shell from (not in my $SPARK_HOME, just in the directory I happened to be in when I ran $SPARK_HOME/bin/spark-shell).

      That META-INF had a services/org.apache.hadoop.fs.FileSystem file specifying some provider classes (S3FileSystem in the example above) that were unsurprisingly not resolvable by Spark.
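This failure mode is just java.util.ServiceLoader's standard lookup: it scans every classpath root for META-INF/services/&lt;interface&gt; files, and a provider class named there but not resolvable produces exactly the ServiceConfigurationError above. A minimal sketch of the mechanism, independent of Spark and Hadoop (the directory, interface, and provider names here are made up for illustration):

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

public class StrayMetaInfDemo {
    public static void main(String[] args) throws IOException {
        // Simulate a stray META-INF left behind by some other project:
        // a services file naming a provider class that is not on the classpath.
        Path dir = Files.createTempDirectory("stray");
        Path services = dir.resolve("META-INF/services");
        Files.createDirectories(services);
        Files.write(services.resolve("java.lang.Runnable"),
                "com.example.MissingProvider".getBytes());

        // ServiceLoader scans META-INF/services/<interface> under every
        // classpath root; here we put the temp directory on a class loader's
        // path, the same way a cwd on the classpath gets scanned.
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
            for (Runnable r : ServiceLoader.load(Runnable.class, loader)) {
                r.run(); // never reached: the provider cannot be resolved
            }
        } catch (ServiceConfigurationError e) {
            // Mirrors the "Provider ... not found" error from the report.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```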

      I'm not sure if this is purely my fault for attempting to run Spark from a directory with another project's config files lying around, but I find it somewhat surprising that, given a $SPARK_HOME pointing to a clean Spark install, $SPARK_HOME/bin/spark-shell picks up detritus from the cwd it is called from, so I wanted to at least document it here.


      People

        Assignee: Unassigned
        Reporter: Ryan Williams (rdub)
        Votes: 0
        Watchers: 2
