  Spark / SPARK-13391

Use Apache Arrow as In-memory columnar store implementation

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.0, 2.0.0
    • Fix Version/s: None
    • Component/s: Spark Core, SQL
    • Labels: None

    Description

      Idea.
      Apache Arrow (http://arrow.apache.org/) is an open-source implementation of an in-memory columnar store. It has APIs in many programming languages.
      We could use it in Apache Spark to avoid data (de)serialization
      when running PySpark (and R) UDFs.
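
      To illustrate the idea, here is a minimal sketch of the difference between a row-at-a-time exchange and a columnar one. It does not use Arrow or Spark: the standard library's `array` and `memoryview` stand in for a columnar buffer, and the names `udf_rowwise`, `udf_columnar`, and `values` are hypothetical, chosen only for this example.

```python
import pickle
from array import array

# Hypothetical data: one numeric column of a table.
values = [1.0, 2.5, 4.0, 8.5]


def udf_rowwise(rows):
    """Row-at-a-time path: each value is serialized and deserialized on its own."""
    out = []
    for row in rows:
        payload = pickle.dumps(row)   # serialize a single row
        x = pickle.loads(payload)     # deserialize it on the receiving side
        out.append(x * 2)
    return out


def udf_columnar(column):
    """Columnar path: the whole column is read from one contiguous buffer."""
    view = memoryview(column)         # a view over the buffer, no copy made
    return [x * 2 for x in view]


# The column is stored as a contiguous block of C doubles.
column = array("d", values)

rowwise_result = udf_rowwise(values)
columnar_result = udf_columnar(column)
```

      Both functions compute the same result; the columnar version reads the values through a view over a single shared buffer instead of encoding and decoding each row separately, which is the kind of saving the proposal is after.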

            People

              Assignee: Unassigned
              Reporter: Maciej Bryński (maver1ck)
              Votes: 1
              Watchers: 24
