Beam / BEAM-1251 Python 3 Support / BEAM-7527

Python 3 test parallelization causes test flakiness due to ModuleNotFoundError.

Details

    Description

      I am seeing several errors in Python SDK integration test suites, such as Dataflow ValidatesRunner and Python PostCommit, which fail because one of the autogenerated `*_pb2` modules cannot be found.

      For example:

      /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py:84: UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully supported. You may encounter buggy behavior or missing features.
        'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
      Failure: ModuleNotFoundError (No module named 'beam_runner_api_pb2') ... 
      ERROR
      ======================================================================
      ERROR: Failure: ModuleNotFoundError (No module named 'beam_runner_api_pb2')
      ----------------------------------------------------------------------
      Traceback (most recent call last):
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/failure.py", line 39, in runTest
          raise self.exc_val.with_traceback(self.tb)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/loader.py", line 418, in loadTestsFromName
          addr.filename, addr.module)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py", line 47, in importFromPath
          return self.importFromDir(dir_path, fqname)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py", line 94, in importFromDir
          mod = load_module(part_fqname, fh, filename, desc)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py", line 245, in load_module
          return load_package(name, filename)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py", line 217, in load_package
          return _load(spec)
        File "<frozen importlib._bootstrap>", line 684, in _load
        File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
        File "<frozen importlib._bootstrap_external>", line 678, in exec_module
        File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py", line 97, in <module>
          from apache_beam import coders
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/__init__.py", line 19, in <module>
          from apache_beam.coders.coders import *
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coders.py", line 32, in <module>
          from apache_beam.coders import coder_impl
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coder_impl.py", line 44, in <module>
          from apache_beam.utils import windowed_value
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/windowed_value.py", line 34, in <module>
          from apache_beam.utils.timestamp import MAX_TIMESTAMP
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/timestamp.py", line 34, in <module>
          from apache_beam.portability import common_urns
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/common_urns.py", line 25, in <module>
          from apache_beam.portability.api import metrics_pb2
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/api/metrics_pb2.py", line 16, in <module>
          import beam_runner_api_pb2 as beam__runner__api__pb2
      ModuleNotFoundError: No module named 'beam_runner_api_pb2'
      
      /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py:84: UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully supported. You may encounter buggy behavior or missing features.
        'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
      Failure: ModuleNotFoundError (No module named 'endpoints_pb2') ... 
      ERROR
      ======================================================================
      ERROR: Failure: ModuleNotFoundError (No module named 'endpoints_pb2')
      ----------------------------------------------------------------------
      Traceback (most recent call last):
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/failure.py", line 39, in runTest
          raise self.exc_val.with_traceback(self.tb)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/loader.py", line 418, in loadTestsFromName
          addr.filename, addr.module)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py", line 47, in importFromPath
          return self.importFromDir(dir_path, fqname)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/site-packages/nose/importer.py", line 94, in importFromDir
          mod = load_module(part_fqname, fh, filename, desc)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py", line 245, in load_module
          return load_package(name, filename)
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/build/gradleenv/-1734967053/lib/python3.6/imp.py", line 217, in load_package
          return _load(spec)
        File "<frozen importlib._bootstrap>", line 684, in _load
        File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
        File "<frozen importlib._bootstrap_external>", line 678, in exec_module
        File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/__init__.py", line 97, in <module>
          from apache_beam import coders
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/__init__.py", line 19, in <module>
          from apache_beam.coders.coders import *
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coders.py", line 32, in <module>
          from apache_beam.coders import coder_impl
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/coders/coder_impl.py", line 44, in <module>
          from apache_beam.utils import windowed_value
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/windowed_value.py", line 34, in <module>
          from apache_beam.utils.timestamp import MAX_TIMESTAMP
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/timestamp.py", line 34, in <module>
          from apache_beam.portability import common_urns
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/common_urns.py", line 24, in <module>
          from apache_beam.portability.api import beam_runner_api_pb2
        File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/portability/api/beam_runner_api_pb2.py", line 16, in <module>
          import endpoints_pb2 as endpoints__pb2
      ModuleNotFoundError: No module named 'endpoints_pb2'
      

      The root cause is not clear. I suspect it may be related to the way we parallelize execution of the Python test suites for 2.7, 3.5, 3.6, and 3.7.
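
Both tracebacks end at a bare `import ..._pb2` statement inside a generated file (`metrics_pb2.py` and `beam_runner_api_pb2.py`). Python 3 has no implicit relative imports, so such a statement only resolves if the `api` directory itself is on `sys.path`. To help narrow this down, here is a minimal sketch of a check that could be run against `apache_beam/portability/api` while the suites are executing. It assumes the failure comes from a test process seeing generated files that still contain bare imports, which is not confirmed. The helper name `find_bare_pb2_imports` is hypothetical and not part of Beam.

```python
import os
import re

# Matches a bare top-level import of a sibling generated module, e.g.
#   import beam_runner_api_pb2 as beam__runner__api__pb2
# It does not match package-relative forms such as
#   from . import endpoints_pb2 as endpoints__pb2
BARE_PB2_IMPORT = re.compile(r"^import\s+(\w+_pb2)\b")


def find_bare_pb2_imports(api_dir):
    """Return (filename, line_number, module) for each bare *_pb2 import.

    Hypothetical diagnostic helper: it only reads the generated files
    and never modifies them.
    """
    hits = []
    for name in sorted(os.listdir(api_dir)):
        if not name.endswith("_pb2.py"):
            continue
        with open(os.path.join(api_dir, name)) as f:
            for lineno, line in enumerate(f, start=1):
                match = BARE_PB2_IMPORT.match(line)
                if match:
                    hits.append((name, lineno, match.group(1)))
    return hits
```

If this returns a non-empty list only intermittently while several suites run against the same source tree, that would be consistent with the parallelization hypothesis above, though it would not prove it.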

      cc: altay markflyhigh Juta frederik

            People

              markflyhigh Mark Liu
              tvalentyn Valentyn Tymofieiev