  1. Beam
  2. BEAM-13599

Overflow in Python Datastore RampupThrottlingFn

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.32.0, 2.33.0, 2.34.0, 2.35.0
    • Fix Version/s: 2.36.0
    • Component/s: io-py-gcp
    • Labels: None

    Description

      File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/datastore/v1new/rampup_throttling_fn.py", line 74, in _calc_max_ops_budget
      max_ops_budget = int(self._BASE_BUDGET / self._num_workers * (1.5**growth))
      RuntimeError: OverflowError: (34, 'Numerical result out of range') [while running 'Write to Datastore/Enforce throttling during ramp-up-ptransform-483']
      

      The intermediate value `1.5**growth` is a float whose exponent depends on the time elapsed since the pipeline started, so it overflows in long-running pipelines (usually around the 6th day).
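The overflow point can be estimated with a short sketch. It assumes, which this issue does not state, that `growth` advances by 1 for every 5 minutes of pipeline runtime; `ASSUMED_INTERVAL_MINUTES` and the helper `overflows` are illustrative names, not Beam identifiers.

```python
import math
import sys

# Smallest integer exponent at which 1.5**growth no longer fits in a float.
overflow_growth = math.ceil(math.log(sys.float_info.max) / math.log(1.5))


def overflows(growth):
    """Return True if evaluating 1.5**growth raises OverflowError."""
    try:
        1.5 ** growth
    except OverflowError:
        return True
    return False


# Assumption (not stated in the issue): one growth step per 5 minutes.
ASSUMED_INTERVAL_MINUTES = 5
days_until_overflow = overflow_growth * ASSUMED_INTERVAL_MINUTES / (60 * 24)

print(overflow_growth, overflows(overflow_growth), days_until_overflow)
```

Under that assumption the threshold falls slightly past six days of runtime, which is consistent with the timing reported above.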

      `max_ops_budget` should either be clipped to float('inf') or INT_MAX, or the throttling decision should be short-circuited at this point, since throttling will long be irrelevant by then.
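A minimal sketch of the clipping option follows. It is a standalone function, not the code that was merged; the name `calc_max_ops_budget`, its parameters, and the use of `sys.maxsize` as a stand-in for INT_MAX are assumptions made for illustration.

```python
import sys


def calc_max_ops_budget(base_budget, num_workers, growth, cap=sys.maxsize):
    """Hypothetical clipped variant of the budget formula from the traceback."""
    try:
        budget = int(base_budget / num_workers * (1.5 ** growth))
    except OverflowError:
        # Raised either by 1.5**growth for a large exponent, or by int()
        # when the multiplication silently produced float('inf').
        return cap
    return min(budget, cap)


print(calc_max_ops_budget(500, 1, 0))     # early in the ramp-up
print(calc_max_ops_budget(500, 1, 2000))  # far past the overflow point
```

Catching the exception around the whole expression covers both failure modes: the power itself overflowing, and the product becoming infinite before the integer conversion.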

          People

            Assignee: danthev Daniel Thevessen
            Reporter: danthev Daniel Thevessen
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 5.5h