Spark / SPARK-16145

spark-ec2 script on 1.6.1 does not allow instances to use sqlContext


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: EC2
    • Labels: None
    • Environment: AWS EC2

    Description

      Downloaded 1.6.1 for Hadoop 2.4.

      I used the spark-ec2 script to create a cluster and I'm running into an issue that prevents importing sqlContext. Reading prior reports, I looked at the output to find the first error:

      java.lang.RuntimeException: java.io.IOException: Filesystem closed

      I'm not sure how to diagnose this. After exiting the Spark REPL and re-entering, every subsequent time I get this error:

      java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx--x--x

      I assume that some env script is specifying this, since /tmp/hive doesn't exist. I thought that this would be taken care of by the spark-ec2 script so you could just go to town.

      I have no experience with HDFS. I have used Spark on Cassandra and on S3, but I've never deployed it myself. I tried this:

      [root@ip-172-31-57-109 ephemeral-hdfs]$ bin/hadoop fs -ls

      Warning: $HADOOP_HOME is deprecated.

      ls: Cannot access .: No such file or directory.
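      I'm guessing this just means that root has no home directory in HDFS yet, so "." doesn't resolve to anything. Listing the root of the filesystem explicitly would presumably be the way to see what's actually there (I haven't verified this on the cluster):

      [root@ip-172-31-57-109 ephemeral-hdfs]$ bin/hadoop fs -ls /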

      I did see that under /mnt there is the ephemeral-hdfs folder which is in core-site.xml, but there is no tmp folder.

      I tried again with the download for Hadoop 1.x.

      Same behavior.

      It's curious to me that spark-ec2 has an argument for specifying the Hadoop version; is this required? It would seem that you've already specified it when downloading.
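      For reference, this is the sort of launch command I'm talking about, including the Hadoop-version argument in question (key pair, instance type, and cluster name here are placeholders, not my real values):

      ./ec2/spark-ec2 \
        --key-pair=my-keypair \
        --identity-file=/path/to/my-keypair.pem \
        --slaves=2 \
        --instance-type=m3.large \
        --hadoop-major-version=2 \
        launch my-spark-cluster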

      I tried to create the path "tmp/hive" under /mnt/ephemeral-hdfs and chmod to 777. No joy.
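      One thing I haven't tried yet is creating and opening up the directory in HDFS itself rather than on the local disk, which I'm guessing is what the error is actually complaining about. Presumably something like this (the -p flag assumes the Hadoop 2.x shell):

      [root@ip-172-31-57-109 ephemeral-hdfs]$ bin/hadoop fs -mkdir -p /tmp/hive
      [root@ip-172-31-57-109 ephemeral-hdfs]$ bin/hadoop fs -chmod 777 /tmp/hive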

      sqlContext is obviously a critical part of the Spark platform. The interesting thing is that I don't need HDFS at all - I'm going to be reading from S3 and writing to MySQL.
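      For what it's worth, once this works my plan is just to start the shell with the MySQL JDBC driver on the classpath and AWS credentials in the environment; as I understand it, Spark 1.x copies the AWS env vars into the Hadoop s3n configuration. The jar path and credentials below are placeholders:

      export AWS_ACCESS_KEY_ID=<my-access-key>
      export AWS_SECRET_ACCESS_KEY=<my-secret-key>
      /root/spark/bin/spark-shell --jars /root/mysql-connector-java-5.1.38-bin.jar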

      Attachments

        Activity


          People

            Assignee: Unassigned
            Reporter: Richard Bross (rabinnh)
            Votes: 0
            Watchers: 2
