Spark / SPARK-31341

Spark documentation incorrectly claims Python 3.8 compatibility


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.4.5
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      The Spark documentation (https://spark.apache.org/docs/latest/) has this text:

      Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x).

      This suggests that Spark is compatible with Python 3.8, which is not the case. For example, in the latest ubuntu:18.04 Docker image:

       

      apt-get update
      apt-get install -y python3.8 python3-pip
      pip3 install pyspark              # installs for the default python3.6
      python3.8 -m pip install pyspark  # installs for python3.8
      python3.8 -c 'import pyspark'
      

      Outputs:

      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/usr/local/lib/python3.8/dist-packages/pyspark/__init__.py", line 51, in <module>
          from pyspark.context import SparkContext
        File "/usr/local/lib/python3.8/dist-packages/pyspark/context.py", line 31, in <module>
          from pyspark import accumulators
        File "/usr/local/lib/python3.8/dist-packages/pyspark/accumulators.py", line 97, in <module>
          from pyspark.serializers import read_int, PickleSerializer
        File "/usr/local/lib/python3.8/dist-packages/pyspark/serializers.py", line 72, in <module>
          from pyspark import cloudpickle
        File "/usr/local/lib/python3.8/dist-packages/pyspark/cloudpickle.py", line 145, in <module>
          _cell_set_template_code = _make_cell_set_template_code()
        File "/usr/local/lib/python3.8/dist-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
          return types.CodeType(
      TypeError: an integer is required (got type bytes)
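
      For context, the traceback points at cloudpickle rebuilding a code object by calling `types.CodeType` with the pre-3.8 argument order; Python 3.8 inserted a positional-only-argument count into that constructor, so the bytes of `co_code` land in a slot that expects an int. One way to guard the construction is sketched below (the helper name `make_code` is illustrative, not from Spark or cloudpickle):

      # Sketch: rebuild a code object with new bytecode across Python versions.
      # The helper name make_code is illustrative, not from Spark or cloudpickle.
      import sys
      import types

      def make_code(co, new_bytecode):
          if sys.version_info >= (3, 8):
              # CodeType.replace() was added in 3.8 and sidesteps the changed
              # constructor signature (which now takes posonlyargcount).
              return co.replace(co_code=new_bytecode)
          # Pre-3.8 constructor order; passing these same positional arguments
          # on 3.8 shifts co_code (bytes) into an int slot, producing the
          # TypeError shown above.
          return types.CodeType(
              co.co_argcount, co.co_kwonlyargcount, co.co_nlocals,
              co.co_stacksize, co.co_flags, new_bytecode, co.co_consts,
              co.co_names, co.co_varnames, co.co_filename, co.co_name,
              co.co_firstlineno, co.co_lnotab, co.co_freevars, co.co_cellvars,
          )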
      

      I propose the documentation be updated to say "Python 3.4 to 3.7". I also propose that pyspark's `setup.py` include:

          python_requires=">=3.6,<3.8",
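
      For illustration, the constraint would sit inside the `setup()` call roughly as follows (a sketch; every argument other than `python_requires` is a placeholder, not pyspark's actual packaging metadata):

      # Sketch of the proposed change in pyspark's setup.py; arguments other
      # than python_requires are illustrative placeholders.
      from setuptools import setup

      setup(
          name="pyspark",
          version="2.4.5",
          packages=["pyspark"],
          python_requires=">=3.6,<3.8",  # refuse to install on Python 3.8
      )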
      


          People

            Assignee: Unassigned
            Reporter: Daniel King (danking)
            Votes: 0
            Watchers: 1
