SPARK-22216: Improving PySpark/Pandas interoperability

Details

    • Type: Epic
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: PySpark

    Description

      This is an umbrella ticket tracking the general effort to improve performance and interoperability between PySpark and Pandas. The core idea is to use Apache Arrow as the serialization format to reduce the overhead of moving data between PySpark and Pandas.
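
The description names serialization overhead as the problem. As a rough illustration of why a columnar format can help, the sketch below compares pickling records one at a time with packing each column into a single typed buffer. It uses only the Python standard library and is an analogy: it is not Arrow's actual wire format and not code from Spark. The sample data and all function names are hypothetical.

```python
import pickle
from array import array

# Hypothetical sample data: 1000 (id, value) rows, not taken from the ticket.
rows = [(i, i * 0.5) for i in range(1000)]


def serialize_row_wise(rows):
    """Pickle each row separately, producing one payload per record."""
    return [pickle.dumps(row) for row in rows]


def deserialize_row_wise(payloads):
    """Unpickle each payload back into a row tuple."""
    return [pickle.loads(p) for p in payloads]


def serialize_columnar(rows):
    """Pack each column into one contiguous, fixed-width buffer."""
    ids = array("q", (r[0] for r in rows))     # signed 64-bit integers
    values = array("d", (r[1] for r in rows))  # 64-bit floats
    return ids.tobytes(), values.tobytes()


def deserialize_columnar(id_bytes, value_bytes):
    """Rebuild the rows from the two column buffers."""
    ids = array("q")
    ids.frombytes(id_bytes)
    values = array("d")
    values.frombytes(value_bytes)
    return list(zip(ids, values))


row_payloads = serialize_row_wise(rows)
id_bytes, value_bytes = serialize_columnar(rows)

# Total bytes produced by each approach for the same data.
row_wise_size = sum(len(p) for p in row_payloads)
columnar_size = len(id_bytes) + len(value_bytes)

print("row-wise bytes:", row_wise_size)
print("columnar bytes:", columnar_size)
```

In the columnar version each value costs a fixed 8 bytes and a column can be read back in one bulk operation, whereas every pickled row carries its own framing and has to be decoded individually. That per-record cost is the kind of overhead a columnar format such as Arrow is meant to reduce.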

            People

              Assignee: Li Jin (icexelloss)
              Reporter: Li Jin (icexelloss)
              Votes: 0
              Watchers: 31
