  1. Beam
  2. BEAM-9085

Performance regression in np.random.RandomState() skews performance test results across Python 2/3 on Dataflow

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: testing
    • Labels: None

    Description

      Tests show that core Beam operations in Python 3.x on Dataflow can run several times slower than in Python 2.7. We should investigate the cause of the problem.

      Currently, we have one ParDo test that is run both in Py3 and Py2 [1]. A dashboard with runtime results can be found here [2].

      [1] sdks/python/apache_beam/testing/load_tests/pardo_test.py

      [2] https://apache-beam-testing.appspot.com/explore?dashboard=5678187241537536
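
      The issue title points at np.random.RandomState() as the source of the skew. As a rough way to check that outside of Beam, the sketch below times a constructor call under whichever interpreter runs it. The helper name seconds_per_call is hypothetical and not part of Beam, and the script does not reproduce the load test in [1]. If NumPy is not installed, it falls back to random.Random purely so the sketch still runs.

```python
import random
import timeit


def seconds_per_call(factory, number=200, repeat=5):
    """Return the best observed time, in seconds, for one call to factory()."""
    # timeit.repeat runs `number` calls per trial; taking the minimum over
    # the trials reduces noise from other processes on the machine.
    trials = timeit.repeat(factory, number=number, repeat=repeat)
    return min(trials) / number


try:
    import numpy as np
    factory = np.random.RandomState  # constructor named in the issue title
except ImportError:
    factory = random.Random  # stdlib stand-in (assumption) so the sketch runs

cost = seconds_per_call(factory)
print("{}(): {:.1f} us per call".format(factory.__name__, cost * 1e6))
```

      Running the same script under each Python version being compared gives a per-call cost that can be set against the ParDo runtimes on the dashboard. Absolute numbers will depend on the machine and the NumPy version.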

            People

              Assignee: kamilwu Kamil Wasilewski
              Reporter: kamilwu Kamil Wasilewski
              Votes: 0
              Watchers: 2


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 9h 20m