Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-8608

Chain DoFns in Flink batch runner when possible.

Details

    • Improvement
    • Status: Resolved
    • P2
    • Resolution: Fixed
    • 2.16.0
    • 2.23.0
    • runner-flink
    • None

    Description

      Right now, in Batch runner, DoFn is executed using MapPartition operator (FlinkDoFnFunction), which doesn't have chained driver implementation.

      We need to reimplement DoFnFunction with FlatMap to allow chaining.

      Attached is the execution graph for the same pipeline, before and after the patch.

      Attachments

        1. Screen Shot 2019-11-07 at 10.35.22.png
          283 kB
          David Morávek
        2. Screen Shot 2019-11-07 at 10.35.07.png
          489 kB
          David Morávek

        Issue Links

          Activity

            People

              dmvk David Morávek
              dmvk David Morávek
              Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h 10m
                  1h 10m