  Beam / BEAM-11644

translations.pack_combiners optimizer causes breaking change to metrics API

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P1
    • Resolution: Fixed
    • Affects Version/s: 2.27.0
    • Fix Version/s: 2.28.0
    • Component/s: sdk-py-core
    • Labels: None

    Description

      The translations.pack_combiners optimizer causes a breaking change in the public metrics API. Metrics are keyed and queryable by step name, and the step name can change after combiner packing. Suppose a pipeline looks like `pipeline | CombinePerKey(combinefn_1); pipeline | CombinePerKey(combinefn_2)`, and both combinefn_1 and combinefn_2 increment the same counter once per input element. Previously, the result had two counters, one for step combinefn_1 and one for step combinefn_2, each with value num_input_elements. After combiner packing, the result has a single counter for step Packed[combinefn_1, combinefn_2] with value 2 * num_input_elements.
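
      The effect can be illustrated with a small toy model. This is only a sketch in plain Python, not Beam code: the functions `run_unpacked`, `run_packed`, and `query`, and the counter name `elements_seen`, are hypothetical names chosen for illustration. It assumes counters are stored under a (step name, counter name) key and that a packed step reports under a name of the form `Packed[...]`.

```python
from collections import Counter


def run_unpacked(elements, step_names):
    """Toy model: each combine step reports into its own step-scoped counter."""
    metrics = Counter()
    for step in step_names:
        for _ in elements:
            metrics[(step, "elements_seen")] += 1
    return metrics


def run_packed(elements, step_names):
    """Toy model: after packing, every sub-combiner reports under one packed step."""
    packed_step = "Packed[%s]" % ", ".join(step_names)
    metrics = Counter()
    for _ in elements:
        for _ in step_names:  # each packed combine fn increments the same counter
            metrics[(packed_step, "elements_seen")] += 1
    return metrics


def query(metrics, step):
    """Return the counters recorded for a single step name."""
    return {name: value for (s, name), value in metrics.items() if s == step}
```

      With five input elements and the steps `["combinefn_1", "combinefn_2"]`, querying the unpacked result by `"combinefn_1"` yields a count of 5. Querying the packed result by the same step name yields nothing, and the only counter lives under `"Packed[combinefn_1, combinefn_2]"` with a count of 10.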

      Unfortunately, there is no easy fix: the runner would have to know that a step is a packed step and use the appropriate metrics container for each sub-step.
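
      One way to picture what such awareness might involve is sketched below. This is a toy model under stated assumptions, not Beam's runner code: `MetricsEnvironment`, `scoped_step`, and `run_packed_step` are hypothetical names, and the sketch assumes the packed step still knows the original step name of each sub-combiner so it can switch the active metrics scope before delegating.

```python
from collections import Counter
from contextlib import contextmanager


class MetricsEnvironment:
    """Toy stand-in for a runner's per-step metrics bookkeeping (hypothetical)."""

    def __init__(self):
        self.counters = Counter()
        self._current_step = None

    @contextmanager
    def scoped_step(self, step):
        """Attribute all increments inside the block to `step`, then restore."""
        previous, self._current_step = self._current_step, step
        try:
            yield
        finally:
            self._current_step = previous

    def inc(self, name, amount=1):
        self.counters[(self._current_step, name)] += amount


def run_packed_step(env, elements, sub_steps):
    """Run a packed step; `sub_steps` is a list of (original_step_name, add_input)."""
    packed_name = "Packed[%s]" % ", ".join(name for name, _ in sub_steps)
    with env.scoped_step(packed_name):
        for element in elements:
            for name, add_input in sub_steps:
                # Switch to the sub-step's own scope before delegating.
                with env.scoped_step(name):
                    add_input(env, element)
```

      In this model, a sub-combiner that calls `env.inc("elements_seen")` is credited to its original step name even though it executes inside the packed step, so queries by the original step names keep working.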

      The short-term workaround is to (1) add a note under known issues for 2.27 and (2) make this optimizer phase opt-in in 2.28.

          People

            robertwb Robert Bradshaw
            myffical@gmail.com Yifan Mai
            Votes: 0
            Watchers: 3

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 8h 50m