Spark / SPARK-44157

Outdated JARs in PySpark package

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.4.1
    • Fix Version/s: None
    • Component/s: Build, PySpark

    Description

      The JARs that ship embedded within PySpark's package on PyPI don't seem to be aligned with the deps specified in Spark's own `pom.xml`.

      For example, in Spark's `pom.xml`, `protobuf-java` is set to `3.21.12`:

      https://github.com/apache/spark/blob/6b1ff22dde1ead51cbf370be6e48a802daae58b6/pom.xml#L127

      However, if we look at the JARs embedded within the PySpark tarball, the version of `protobuf-java` is `2.5.0` (i.e. `..../site-packages/pyspark/jars/protobuf-java-2.5.0.jar`). The same seems to apply to the other dependencies.
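
      To check which versions are actually embedded, something along these lines could be run against an installed PySpark. This is only a minimal sketch: the helper names `parse_jar_name` and `embedded_jar_versions` are made up for illustration, and it assumes JAR file names follow the usual `<artifact>-<version>.jar` pattern (any classifier suffix ends up as part of the version string).

```python
import re
from pathlib import Path

# Assumption: JAR names look like "<artifact>-<version>.jar", where the
# version starts with a digit. Names that do not match are skipped.
_JAR_RE = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def parse_jar_name(filename):
    """Split a JAR file name into (artifact, version), or return None."""
    match = _JAR_RE.match(filename)
    if match is None:
        return None
    return match.group("artifact"), match.group("version")


def embedded_jar_versions(jars_dir):
    """Map each artifact found in jars_dir to the list of versions present."""
    versions = {}
    for jar in sorted(Path(jars_dir).glob("*.jar")):
        parsed = parse_jar_name(jar.name)
        if parsed is not None:
            artifact, version = parsed
            versions.setdefault(artifact, []).append(version)
    return versions


if __name__ == "__main__":
    # Requires a pip-installed PySpark; the JARs live next to the package.
    import pyspark

    jars_dir = Path(pyspark.__file__).parent / "jars"
    found = embedded_jar_versions(jars_dir)
    print("protobuf-java:", found.get("protobuf-java"))
```

      Running it as a script prints whatever `protobuf-java` version(s) sit in the local `pyspark/jars` directory, which can then be compared by hand with the value in `pom.xml`.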

      This leaves PySpark exposed to a set of CVEs that are already fixed in upstream Spark (e.g. `CVE-2022-3509`, `CVE-2021-22569`, `CVE-2015-5237` and a few others). It is also a potential source of conflicts whenever one of these deps introduces a breaking change.
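
      The mismatch could also be checked automatically, roughly as sketched below. This is a sketch under assumptions, not how Spark's build does it: it assumes the version is declared as a Maven property (the name `protobuf.version` is an assumption here), `SAMPLE_POM` is an illustrative fragment rather than the real file, and `pom_property` / `is_aligned` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the real pom.xml is far larger. The property
# name "protobuf.version" is an assumption, the value comes from the report.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <properties>
    <protobuf.version>3.21.12</protobuf.version>
  </properties>
</project>"""


def _local_name(tag):
    """Drop the "{namespace}" prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def pom_property(pom_xml, name):
    """Return the value of <properties>/<name> in a POM, or None if absent."""
    root = ET.fromstring(pom_xml)
    for section in root:
        if _local_name(section.tag) != "properties":
            continue
        for prop in section:
            if _local_name(prop.tag) == name:
                return (prop.text or "").strip()
    return None


def is_aligned(pom_xml, property_name, embedded_version):
    """True when the embedded JAR version equals the version in the POM."""
    return pom_property(pom_xml, property_name) == embedded_version


if __name__ == "__main__":
    declared = pom_property(SAMPLE_POM, "protobuf.version")
    # "2.5.0" is the embedded protobuf-java version quoted in this report.
    print("declared:", declared, "aligned:", is_aligned(SAMPLE_POM, "protobuf.version", "2.5.0"))
```

      For the real file, the POM could be read with `Path("pom.xml").read_bytes()` and passed in unchanged, since `ET.fromstring` accepts bytes as well as strings.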

          People

            Assignee: Unassigned
            Reporter: Adrian Gonzalez-Martin (adriangonz)