Beam / BEAM-6902

Beam model contract for finalization of CheckpointMarks

    Description

      Question: What is the contract in the Beam model for when checkpoint marks are finalized, if there is one?

      I'm working on a pipeline that reads messages from Kafka using KafkaIO, and I'm looking at the commitOffsetsInFinalize() option and the KafkaCheckpointMark class.

      I want to achieve at-least-once message delivery semantics, so I need to be sure that offsets are committed to Kafka only after the corresponding messages have been written to some sink.
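
      The ordering asked about here can be sketched in plain Java with no Beam or Kafka dependency. This is only a toy model under a stated assumption: that the runner calls the finalize callback after the bundle's output has been durably written. Whether Beam or DataflowRunner actually guarantees that is exactly the open question in this issue. All names (FakeBroker, ToyReader, ToyCheckpointMark, CommitAfterSinkDemo) are hypothetical and are not Beam or Kafka APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for the committed offset of one Kafka partition (hypothetical). */
class FakeBroker {
    long committedOffset = 0; // offset a restarted reader resumes from
}

/** Hypothetical single-callback interface, loosely modelled on a checkpoint mark. */
interface ToyCheckpointMark {
    void finalizeCheckpoint();
}

/** Reads records starting at the broker's committed offset. */
class ToyReader {
    private final FakeBroker broker;
    private final List<String> log;
    private long position;

    ToyReader(FakeBroker broker, List<String> log) {
        this.broker = broker;
        this.log = log;
        this.position = broker.committedOffset; // resume from the last commit
    }

    List<String> readBatch(int max) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < max && position < log.size()) {
            batch.add(log.get((int) position++));
        }
        return batch;
    }

    ToyCheckpointMark checkpoint() {
        final long next = position; // offset captured at checkpoint time
        return () -> broker.committedOffset = next;
    }
}

class CommitAfterSinkDemo {
    /**
     * Processes bundles of two records. If crashOnBundle (1-based) matches,
     * the run stops after the sink write but before finalize; 0 means never.
     */
    static List<String> run(FakeBroker broker, List<String> log,
                            List<String> sink, int crashOnBundle) {
        ToyReader reader = new ToyReader(broker, log);
        int bundle = 0;
        while (true) {
            List<String> batch = reader.readBatch(2);
            if (batch.isEmpty()) {
                return sink;
            }
            bundle++;
            ToyCheckpointMark mark = reader.checkpoint();
            sink.addAll(batch);                 // 1. output is written first
            if (bundle == crashOnBundle) {
                return sink;                    // simulated crash before commit
            }
            mark.finalizeCheckpoint();          // 2. only then commit the offset
        }
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "b", "c", "d");
        FakeBroker broker = new FakeBroker();
        List<String> sink = new ArrayList<>();
        run(broker, log, sink, 2); // crash on the second bundle
        run(broker, log, sink, 0); // restart from the committed offset
        System.out.println(sink + " committed=" + broker.committedOffset);
    }
}
```

      In this model a crash between the sink write and the commit makes the restarted reader re-read "c" and "d", so the sink sees them twice but never misses a record. That is at-least-once delivery. If the commit happened before the sink write, the same crash would lose records instead.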

      Looking at the CheckpointMark interface, it's not clear when finalization should be expected to happen.

      Is it runner-dependent? What should I expect when executing on DataflowRunner?

      The KafkaIO.Read javadoc on commitOffsetsInFinalize

      https://beam.apache.org/releases/javadoc/2.9.0/org/apache/beam/sdk/io/kafka/KafkaIO.Read.html#commitOffsetsInFinalize--

      doesn't clarify this either, particularly the phrase

      But it does not provide hard processing guarantees

      What exactly are "hard processing guarantees"?

      Could I please ask for a documentation improvement for CheckpointMark and commitOffsetsInFinalize?

       

      Attachments

        1. BEAM-6902 diagrams.pdf
          66 kB
          johntumminaro
        2. beam-kafka-io-commit-model-examples-master.zip
          31 kB
          Mark Norkin

          People

            marknorkin Mark Norkin