Beam / BEAM-1251 Python 3 Support / BEAM-6158

Using --save_main_session fails on Python 3 when the main module invokes a superclass method using 'super'.


    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: sdk-py-harness
    • Labels:
      None

      Description

      A typical manifestation of this failure, which can be observed on several Beam examples:

      Traceback (most recent call last):
        File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
          "__main__", mod_spec)
        File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
          exec(code, run_globals)
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 164, in <module>                                                
          run()
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py", line 158, in run                                                     
          | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py", line 426, in __exit__                                                                         
          self.run().wait_until_finish()
        File "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1338, in wait_until_finish                                       
          (self.state, getattr(self._runner, 'last_error_msg', None)), self)
      apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:                                                                                            
      Traceback (most recent call last):
        File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 773, in run
          self._load_main_session(self.local_staging_directory)
        File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session                                                                                                   
          pickler.load_session(session_file)
        File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 280, in load_session                                                                                                        
          return dill.load_session(file_path)
        File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in load_session
          module = unpickler.load()
        File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
          return StockUnpickler.find_class(self, module, name)
      AttributeError: Can't get attribute 'ParseGameEventFn' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'> 
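
The final AttributeError in the traceback above is raised by the standard-library unpickler's class lookup, which dill delegates to (StockUnpickler.find_class). The sketch below uses only the standard pickle module, not dill or Beam, and does not reproduce or explain this bug. It only illustrates that pickle stores a class by module and name, so loading fails when the target module lacks that name. The module name fake_main and the class DemoFn are hypothetical.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module that defines the class at pickling time.
mod = types.ModuleType("fake_main")
exec("class DemoFn:\n    pass\n", mod.__dict__)
sys.modules["fake_main"] = mod

# The payload records only the module name and class name, not the class body.
payload = pickle.dumps(mod.DemoFn())

# Simulate the loading side: a module with that name exists but lacks the class.
del mod.DemoFn

try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    # The lookup of 'DemoFn' on the module fails during unpickling.
    error_message = str(exc)

print(error_message)
```

In the report above the lookup is performed on dataflow_worker.start, the worker's entry module, rather than on the module where ParseGameEventFn was defined.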

       
      Note that the example has the following code [1]:

      class ParseGameEventFn(beam.DoFn):
        def __init__(self):
          super(ParseGameEventFn, self).__init__()
      

      [1] https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81

      +cc: Valentyn Tymofieiev, Robert Bradshaw, Ahmet Altay

              People

              • Assignee: Valentyn Tymofieiev (tvalentyn)
              • Reporter: Mark Liu (markflyhigh)

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 4h 10m