BEAM-10659

ParDo Python streaming load test times out on the 200-iteration case

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Not applicable
    • Component/s: testing
    • Labels: None

    Description

      The Python Dataflow load test, when run in streaming mode on Jenkins, times out on case 2:

      2GB 100 byte records 200 times
      

      This case applies the same ParDo step 200 times in sequence.

      The Jenkins job has a 2 h timeout. The second case is usually cancelled after 1 h 47 min. The most suspicious metric is throughput, which, unlike in other jobs, does not look steady. Sometimes there is a single spike after 1 hour of inactivity; other times there are several spikes (up to 30,000 elements/sec).

      The Python batch scenario takes ~56 minutes, with a steady throughput of ~7,000 elements/sec for almost the whole job run.

      In comparison, the same test case in Java takes ~6 minutes. There, throughput rises to ~100,000 elements/sec and then decreases once all elements have been processed.
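      As a rough sanity check of the numbers above, the sketch below estimates run time from throughput. It assumes that "2GB" means 2 * 10^9 bytes of total input, that the reported throughput is input elements per second, and that the rate is sustained for the whole run; none of these assumptions is confirmed by the job configuration.

```python
TOTAL_BYTES = 2 * 10**9   # assumption: decimal gigabytes
RECORD_BYTES = 100
JOB_TIMEOUT_SEC = 2 * 60 * 60  # the 2 h Jenkins timeout

# Number of input records under the assumptions above: 20,000,000.
RECORDS = TOTAL_BYTES // RECORD_BYTES


def minutes_at(elements_per_sec):
    """Minutes needed to push all records through at a sustained rate."""
    return RECORDS / elements_per_sec / 60


def min_rate_for_timeout():
    """Lowest sustained rate (elements/sec) that still finishes within the timeout."""
    return RECORDS / JOB_TIMEOUT_SEC


if __name__ == "__main__":
    print(f"Python batch at 7,000 el/s: {minutes_at(7_000):.1f} min")   # about 47.6
    print(f"Java at 100,000 el/s: {minutes_at(100_000):.1f} min")       # about 3.3
    print(f"Rate needed to beat the timeout: {min_rate_for_timeout():.0f} el/s")
```

      Under these assumptions the batch and Java estimates are in the same range as the observed ~56 and ~6 minutes, and a streaming job would need to sustain roughly 2,800 elements/sec to finish within the timeout.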


            People

              Assignee: Unassigned
              Reporter: Kasia Kucharczyk (kasiak)