Beam / BEAM-1251 Python 3 Support / BEAM-6158

Using --save_main_session fails on Python 3 when the main module invokes a superclass method using 'super'.

Details

    • Type: Sub-task
    • Status: Triage Needed
    • Priority: P3
    • Resolution: Unresolved
    • Component: sdk-py-harness

    Description

      A typical manifestation of this failure, which can be observed with several Beam examples, is the following traceback:

      Traceback (most recent call last):
        File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
          "__main__", mod_spec)
        File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
          exec(code, run_globals)
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 164, in <module>                                                
          run()
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 158, in run                                                     
          | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py", line 426, in __exit__                                                                         
          self.run().wait_until_finish()
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1338, in wait_until_finish                                       
          (self.state, getattr(self._runner, 'last_error_msg', None)), self)
      apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:                                                                                            
      Traceback (most recent call last):
        File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 773, in run
          self._load_main_session(self.local_staging_directory)
        File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session                                                                                                   
          pickler.load_session(session_file)
        File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 280, in load_session                                                                                                        
          return dill.load_session(file_path)
        File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in load_session
          module = unpickler.load()
        File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
          return StockUnpickler.find_class(self, module, name)
      AttributeError: Can't get attribute 'ParseGameEventFn' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'> 
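
The final AttributeError above is raised when the unpickler looks up a class by module and name and the module on the loading side does not define it. The sketch below reproduces only that general lookup failure with the standard-library pickle module; it does not use dill or Beam. The module name "fake_main" and the variables are hypothetical stand-ins for the launcher's main module and the worker's entry-point module.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the main module on the launching side.
launcher_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = launcher_main
exec("class ParseGameEventFn(object):\n    pass\n", launcher_main.__dict__)

# Instances are pickled by reference to "fake_main.ParseGameEventFn".
payload = pickle.dumps(launcher_main.ParseGameEventFn())

# Simulate the loading side: a module with the same name that does not
# define the class.
sys.modules["fake_main"] = types.ModuleType("fake_main")

error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error = exc

print(type(error).__name__, error)
```

In this sketch the load fails because the class is resolved by name at load time, which is the same kind of lookup that dill's find_class delegates to StockUnpickler.find_class in the traceback.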

       
      Note that the example has the following code [1]:

      class ParseGameEventFn(beam.DoFn):
        def __init__(self):
          super(ParseGameEventFn, self).__init__()
      

      [1] https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
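
As background on why 'super' may matter here: in Python 3, any method whose body references the name super is compiled with an implicit __class__ closure cell that points back at the enclosing class. The sketch below only demonstrates that language behavior with hypothetical classes (Base, UsesSuper, CallsBaseDirectly); whether this cell is what trips up session pickling in this issue is an assumption, not something the report establishes.

```python
class Base(object):
    def __init__(self):
        self.initialized = True


class UsesSuper(Base):
    def __init__(self):
        # Same two-argument form as in the Beam example.
        super(UsesSuper, self).__init__()


class CallsBaseDirectly(Base):
    def __init__(self):
        # No reference to the name 'super', so no implicit cell is created.
        Base.__init__(self)


# On Python 3 the method that mentions 'super' closes over '__class__'.
super_freevars = UsesSuper.__init__.__code__.co_freevars
direct_closure = CallsBaseDirectly.__init__.__closure__
cell_target = UsesSuper.__init__.__closure__[0].cell_contents

print(super_freevars)            # ('__class__',)
print(direct_closure)            # None
print(cell_target is UsesSuper)  # True
```

A serializer that pickles such a method by value has to handle this cell, and through it a reference back to the class itself, which the direct Base.__init__(self) call avoids.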

      +cc: tvalentyn robertwb altay

            People

              Assignee: Unassigned
              Reporter: Mark Liu (markflyhigh)
              Votes: 3
              Watchers: 7

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 4h 10m