Spark / SPARK-42425

spark-hadoop-cloud is not provided in the default Spark distribution


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 3.3.1
    • Fix Version/s: None
    • Component/s: Input/Output
    • Labels: None

    Description

      The spark-hadoop-cloud library and its dependencies, such as hadoop-aws, are absent from the default Spark distribution. As a result, the dependency management section of "Integration with Cloud Infrastructures" is invalid: it treats the cloud integration libraries as provided, but the distribution does not actually ship them.
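One way to confirm the observation above on a given installation is to look for the relevant jars in the distribution's jars directory. The following is a minimal sketch, not part of the original report: the function names and the list of jar name prefixes are assumptions made for illustration, and it relies only on the convention that a Spark distribution keeps its libraries under `<SPARK_HOME>/jars`.

```python
from pathlib import Path

# Jar name prefixes to look for. This list is an assumption for illustration;
# extend it for other cloud connectors as needed.
CLOUD_JAR_PREFIXES = ("spark-hadoop-cloud_", "hadoop-aws-")


def find_cloud_jars(spark_home):
    """Return the sorted names of cloud-integration jars under <spark_home>/jars."""
    jars_dir = Path(spark_home) / "jars"
    if not jars_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in jars_dir.glob("*.jar")
        if p.name.startswith(CLOUD_JAR_PREFIXES)
    )


def has_cloud_support(spark_home):
    """True only if every expected prefix is matched by at least one jar."""
    names = find_cloud_jars(spark_home)
    return all(
        any(name.startswith(prefix) for name in names)
        for prefix in CLOUD_JAR_PREFIXES
    )
```

Calling `has_cloud_support` with the path of an unpacked distribution (for example the value of the `SPARK_HOME` environment variable) returns False when either jar family is missing, which is the situation this issue describes for the default 3.3.1 distribution.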

      A naive workaround would be to add spark-hadoop-cloud as a compile-scope dependency of the application. However, this does not work because of Spark's classloader hierarchy: the Spark system classloader cannot see classes loaded by the application classloader.

      A proper fix would therefore be to enable the hadoop-cloud build profile (-Phadoop-cloud) by default.


          People

            Assignee: Unassigned
            Reporter: Arseniy Tashoyan (tashoyan)
