  1. Beam
  2. BEAM-913

Create the skeleton for a Dataset API Spark runner

Details

    • Type: Wish
    • Status: Resolved
    • Priority: P2
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: runner-spark
    • Labels: None

    Description

      As discussed on the Beam dev list, we should have a second Spark runner based on the Dataset API.
      As part of this, the Spark runner will have three modules: runner-spark-core, runner-spark-rdd (Spark 1.6.x), and runner-spark-dataset (Spark 2.x).
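
The proposed split can be pictured as a shared core module plus one engine-specific module per Spark line. The sketch below is only an illustration of that layout: the module names and Spark versions come from this ticket, while the `modules_for` helper and the rule of matching on the major version number are assumptions made for the example, not part of Beam.

```python
from typing import List

# Module names are taken from the ticket; everything else here is a
# hypothetical illustration of how the three modules relate.
SHARED_MODULE = "runner-spark-core"

# Spark major version -> engine-specific module.
# Assumption: matching on the major version is enough to pick a module.
ENGINE_MODULES = {
    1: "runner-spark-rdd",      # Spark 1.6.x, RDD-based runner
    2: "runner-spark-dataset",  # Spark 2.x, Dataset-based runner
}


def modules_for(spark_version: str) -> List[str]:
    """Return the modules a build against `spark_version` would pull in."""
    major = int(spark_version.split(".")[0])
    if major not in ENGINE_MODULES:
        raise ValueError(f"no runner module planned for Spark {spark_version}")
    # The shared core is always present; the engine module varies.
    return [SHARED_MODULE, ENGINE_MODULES[major]]
```

For example, `modules_for("1.6.2")` pairs the core module with runner-spark-rdd, and any 2.x version pairs it with runner-spark-dataset.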

      This work should go in a feature branch (runner-spark2 already exists).

      This ticket covers creating a skeleton for the module structure described above and porting everything that can easily be carried over from the current runner.

      Some of the work is already in the current feature branch, but a lot has changed since it was last updated.

            People

              Assignee: Unassigned
              Reporter: Amit Sela (amitsela)