[SPARK-13391] Use Apache Arrow as In-memory columnar store implementation - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 1.6.0, 2.0.0
Fix Version/s: None
Component/s: Spark Core, SQL
Labels: None

Description

Idea:
Apache Arrow (http://arrow.apache.org/) is an open-source implementation of an in-memory columnar store, with APIs in many programming languages.
We could use it in Apache Spark to avoid data (de-)serialization
when running PySpark (and R) UDFs.
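
To illustrate the motivation, here is a minimal sketch using only the Python standard library. It is not Arrow or Spark code: `rows`, `roundtrip_rows`, `to_columns`, and `view_column` are hypothetical names chosen for this example. It contrasts a row-oriented path, where every row is pickled and unpickled, with a columnar path, where each column lives in one contiguous typed buffer that can be shared without copying.

```python
import pickle
from array import array

# Hypothetical sample data: (id, value) rows, as a UDF might receive them.
rows = [(i, float(i) * 0.5) for i in range(5)]


def roundtrip_rows(rows):
    """Row-oriented path: each row is serialized and deserialized on its own."""
    return [pickle.loads(pickle.dumps(r)) for r in rows]


def to_columns(rows):
    """Columnar path: pack each column into one contiguous typed buffer."""
    ids = array("q", (r[0] for r in rows))      # 64-bit signed integers
    values = array("d", (r[1] for r in rows))   # 64-bit floats
    return ids, values


def view_column(col):
    """Expose a column's underlying buffer without copying it."""
    return memoryview(col)


ids, values = to_columns(rows)
view = view_column(values)
```

The row path pays a serialization cost per row, while the memoryview reads the same memory as the `values` array. A shared columnar format would, in principle, let the JVM and the Python or R worker agree on such a buffer layout instead of converting row by row.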

Issue Links

duplicates: SPARK-13534 Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas (Resolved)

People

Assignee: Unassigned

Reporter: Maciej Bryński

Votes: 1

Watchers: 24

Dates

Created: 19/Feb/16 10:44

Updated: 29/Feb/16 09:28

Resolved: 29/Feb/16 09:28