BEAM-9487

GBKs on unbounded pcolls with global windows and no triggers should fail

Details

    Description

      This is the behavior described in section "4.2.2.1 GroupByKey and unbounded PCollections" of the programming guide, https://beam.apache.org/documentation/programming-guide/:

      If you do apply GroupByKey or CoGroupByKey to a group of unbounded PCollections without setting either a non-global windowing strategy, a trigger strategy, or both for each collection, Beam generates an IllegalStateException error at pipeline construction time.

      An example where the Python SDK does not raise this error: https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam

      I also believe that the following unit test should fail, since test_stream is unbounded, uses the global window, and has no triggers.

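        from apache_beam import GroupByKey
        from apache_beam.testing.test_pipeline import TestPipeline
        from apache_beam.testing.test_stream import TestStream

        def test_global_window_gbk_fail(self):
          with TestPipeline() as p:
            test_stream = TestStream()
            _ = p | test_stream | GroupByKey()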
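
      The rule quoted from the programming guide can be modelled in a few lines of plain Python. This is only a sketch of the condition, not Beam's implementation: CollectionInfo, check_group_by_key and is_rejected are hypothetical names, and ValueError is an assumed stand-in for whatever exception the SDK would raise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionInfo:
    """Hypothetical stand-in for the PCollection properties the rule looks at."""
    is_bounded: bool
    uses_global_window: bool
    uses_default_trigger: bool


def check_group_by_key(info: CollectionInfo) -> None:
    """Raise if a GroupByKey over this collection could never emit output.

    An unbounded collection in the global window with only the default
    trigger never closes its single window, so the grouping never fires.
    """
    if (not info.is_bounded
            and info.uses_global_window
            and info.uses_default_trigger):
        # ValueError is an assumption made for this sketch.
        raise ValueError(
            "GroupByKey on an unbounded collection needs a non-global "
            "window, a trigger, or both")


def is_rejected(info: CollectionInfo) -> bool:
    """Return True when check_group_by_key refuses the collection."""
    try:
        check_group_by_key(info)
    except ValueError:
        return True
    return False


# The case from the unit test above: unbounded, global window, no trigger.
test_stream_like = CollectionInfo(
    is_bounded=False, uses_global_window=True, uses_default_trigger=True)
```

      Under this model, is_rejected(test_stream_like) is True, while changing any one of the three fields (bounded input, a non-global window, or a non-default trigger) makes the check pass.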
      

          People

            zhoufek Zachary Houfek
            udim Udi Meiri
            Votes: 0
            Watchers: 6

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 38h 20m