  1. Beam
  2. BEAM-3646

Add comments about appropriate use of DoFn.Teardown

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P1
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.0
    • Component/s: sdk-java-core

    Description

      Because the Teardown method has no relation to the atomicity of processing and committing output, it is EXTREMELY DANGEROUS to use it to flush outputs: data still buffered at that point is very likely never to be flushed. If a DoFn instance with buffered data is lost (for example, through worker or machine failure) after the runner has committed the result of processing that input, the data is lost.

       

      Without a comment saying so, users may believe (especially when running a batch pipeline) that their data will be flushed on pipeline completion. This is very dangerous behavior, and we do not warn about it sufficiently.
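
      The failure mode described above can be sketched with a small plain-Java model. This is only an illustration under stated assumptions: `Sink`, `BufferingFn`, and `run` are hypothetical names, none of them are Beam classes, and the "runner" is reduced to a single bundle followed by a commit. The `finishBundle` hook stands in for a per-bundle callback that runs before the commit (in the Beam Java SDK that would presumably be a method annotated with @FinishBundle; that mapping is an assumption, not something this issue states).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the lifecycle; none of these types are Beam classes.
class TeardownFlushSketch {

    /** Hypothetical external sink that flushed records end up in. */
    static final class Sink {
        final List<String> written = new ArrayList<>();
    }

    /** Hypothetical DoFn-like object that buffers elements before writing them. */
    static final class BufferingFn {
        private final Sink sink;
        private final boolean flushInFinishBundle;
        private final List<String> buffer = new ArrayList<>();

        BufferingFn(Sink sink, boolean flushInFinishBundle) {
            this.sink = sink;
            this.flushInFinishBundle = flushInFinishBundle;
        }

        void processElement(String element) {
            buffer.add(element);
        }

        // Runs before the bundle is committed, so a flush here is covered by the commit.
        void finishBundle() {
            if (flushInFinishBundle) {
                flush();
            }
        }

        // Best effort only: never runs if the worker is lost first.
        void teardown() {
            flush();
        }

        private void flush() {
            sink.written.addAll(buffer);
            buffer.clear();
        }
    }

    /** Processes one bundle, commits it, then optionally loses the worker before teardown. */
    static List<String> run(List<String> bundle, boolean flushInFinishBundle, boolean workerLostAfterCommit) {
        Sink sink = new Sink();
        BufferingFn fn = new BufferingFn(sink, flushInFinishBundle);
        for (String element : bundle) {
            fn.processElement(element);
        }
        fn.finishBundle();
        // The modeled runner commits here: the input counts as done and is never retried.
        if (!workerLostAfterCommit) {
            fn.teardown();
        }
        return sink.written;
    }

    public static void main(String[] args) {
        List<String> bundle = Arrays.asList("a", "b", "c");
        // Flush only in teardown, worker lost after commit: prints []
        System.out.println(run(bundle, false, true));
        // Flush in finishBundle, worker lost after commit: prints [a, b, c]
        System.out.println(run(bundle, true, true));
    }
}
```

      In this model the teardown-only variant writes nothing once the worker is lost after the commit, while the variant that flushes before the commit is unaffected. A real runner has more moving parts (retries, multiple bundles per instance), so this is a sketch of the reasoning rather than of any runner's actual behavior.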

            People

              Assignee: Thomas Groh (tgroh)
              Reporter: Thomas Groh (tgroh)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 20m