  Spark / SPARK-26080

Unable to run worker.py on Windows

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.1, 3.0.0
    • Component/s: PySpark
    • Environment: Windows 10 Education 64 bit

    Description

      worker.py imports Python's resource module, which is only available on Unix-based platforms. As a result, worker.py cannot run on a Windows system.
      https://github.com/apache/spark/blob/9a5fda60e532dc7203d21d5fbe385cd561906ccb/python/pyspark/worker.py#L25

      textFile = sc.textFile("README.md")
      textFile.first()
      

      When I run the commands above, I receive the error 'worker failed to connect back', and the console shows an exception coming from worker.py: "ModuleNotFoundError: No module named 'resource'".
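The underlying condition can be checked without Spark. The following is a minimal sketch that only asks the interpreter whether the `resource` module is importable on the current platform. The helper name `resource_module_available` is hypothetical and is not part of PySpark.

```python
import importlib.util
import sys


def resource_module_available() -> bool:
    """Return True if the standard-library `resource` module can be found.

    Hypothetical helper for illustration; it does not import the module,
    it only looks up its import spec.
    """
    return importlib.util.find_spec("resource") is not None


if __name__ == "__main__":
    # On Windows this is expected to report that the module is unavailable.
    print(f"platform={sys.platform}, "
          f"resource available: {resource_module_available()}")
```

If this reports that the module is unavailable, any script that does an unconditional `import resource` will fail at import time on that machine, which matches the exception described above.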

      I do not really know enough about what I'm doing to fix this myself. Apologies if there's something simple I'm missing here.
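One common way to handle a platform-specific module is to guard the import and skip the dependent logic when it is missing. The sketch below illustrates that general pattern only; it is not necessarily the change that resolved this issue. The flag `has_resource_module` and the helper `get_address_space_limit` are names assumed for illustration.

```python
# Guarded import: `resource` exists on Unix-like platforms but not on Windows.
try:
    import resource
    has_resource_module = True
except ImportError:
    resource = None
    has_resource_module = False


def get_address_space_limit():
    """Return the (soft, hard) address-space limit, or None if unavailable.

    Hypothetical helper: it only reads the limit and never changes it.
    """
    if not has_resource_module:
        # Windows (or any platform without `resource`): nothing to query.
        return None
    if not hasattr(resource, "RLIMIT_AS"):
        # Some Unix platforms do not define RLIMIT_AS.
        return None
    return resource.getrlimit(resource.RLIMIT_AS)


if __name__ == "__main__":
    limit = get_address_space_limit()
    if limit is None:
        print("resource limits are not available on this platform")
    else:
        print(f"address-space limit (soft, hard): {limit}")
```

With this shape, code that depends on `resource` degrades to a no-op on Windows instead of failing when the file is imported.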

            People

              Assignee: Hyukjin Kwon (gurwls223)
              Reporter: Hayden Jeune (HaydenJ)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: