Details
- New Feature
- Status: Resolved
- Major
- Resolution: Duplicate
- 1.6.0, 2.0.0
- None
- None
Description
Idea:
Apache Arrow (http://arrow.apache.org/) is an open-source implementation of an in-memory columnar store, with APIs in many programming languages.
We could consider using it in Apache Spark to avoid data (de-)serialization when running PySpark (and R) UDFs.
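To make the idea concrete, here is a minimal sketch of the cost difference between a row-oriented and a columnar hand-off. It is an illustration only: it uses the Python standard library (`pickle`, `array`, `memoryview`) as a stand-in and does not call Arrow or Spark. The function names `rows_via_pickle`, `column_via_buffer`, and `add_one` are hypothetical.

```python
import pickle
from array import array


def rows_via_pickle(rows):
    """Row-oriented hand-off: serialize each row, then deserialize each row."""
    payloads = [pickle.dumps(row) for row in rows]  # one payload per row
    return [pickle.loads(p) for p in payloads]      # one decode per row


def column_via_buffer(values):
    """Columnar hand-off: one contiguous buffer of doubles, reinterpreted in place."""
    payload = array("d", values).tobytes()          # a single contiguous buffer
    return memoryview(payload).cast("d")            # view the bytes as doubles, no per-value decode


def add_one(column):
    """Hypothetical UDF body that consumes a whole column at once."""
    return [x + 1.0 for x in column]


rows = [(float(i),) for i in range(5)]
row_result = rows_via_pickle(rows)
col_result = column_via_buffer([r[0] for r in rows])
udf_result = add_one(col_result)
```

The row path does work proportional to the number of rows on both sides of the process boundary, while the columnar path hands over one buffer that the receiver can read directly. A shared columnar layout such as Arrow's is meant to provide that second property across languages.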
Issue Links
- duplicates SPARK-13534 Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas (Resolved)