Beam / BEAM-1251 Python 3 Support / BEAM-6522

Dill fails to pickle avro.RecordSchema classes on Python 3.

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: P2
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: 2.16.0
    • Component/s: sdk-py-core
    • Labels: None

    Description

      The avroio module still has 4 failing tests: the same 2 tests, each run once for Avro and once for Fastavro.

      apache_beam.io.avroio_test.TestAvro.test_sink_transform
      apache_beam.io.avroio_test.TestFastAvro.test_sink_transform

      fail with:

      Traceback (most recent call last):
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/avroio_test.py", line 432, in test_sink_transform
      | avroio.WriteToAvro(path, self.SCHEMA, use_fastavro=self.use_fastavro)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/pvalue.py", line 112, in __or__
      return self.pipeline.apply(ptransform, self)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/pipeline.py", line 515, in apply
      pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
      return m(transform, input, options)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/runners/runner.py", line 199, in apply_PTransform
      return transform.expand(input)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/avroio.py", line 528, in expand
      return pcoll | beam.io.iobase.Write(self._sink)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/pvalue.py", line 112, in __or__
      return self.pipeline.apply(ptransform, self)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/pipeline.py", line 515, in apply
      pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
      return m(transform, input, options)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/runners/runner.py", line 199, in apply_PTransform
      return transform.expand(input)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/iobase.py", line 960, in expand
      return pcoll | WriteImpl(self.sink)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/pvalue.py", line 112, in __or__
      return self.pipeline.apply(ptransform, self)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/pipeline.py", line 515, in apply
      pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
      return m(transform, input, options)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/runners/runner.py", line 199, in apply_PTransform
      return transform.expand(input)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/iobase.py", line 979, in expand
      lambda _, sink: sink.initialize_write(), self.sink)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/transforms/core.py", line 1103, in Map
      pardo = FlatMap(wrapper, *args, **kwargs)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/transforms/core.py", line 1054, in FlatMap
      pardo = ParDo(CallableWrapperDoFn(fn), *args, **kwargs)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/transforms/core.py", line 864, in __init__
      super(ParDo, self).__init__(fn, *args, **kwargs)
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/transforms/ptransform.py", line 646, in __init__
      self.args = pickler.loads(pickler.dumps(self.args))
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/internal/pickler.py", line 247, in loads
      return dill.loads(s)
      File "/home/robbe/workspace/beam/sdks/python/.eggs/dill-0.2.9-py3.5.egg/dill/_dill.py", line 317, in loads
      return load(file, ignore)
      File "/home/robbe/workspace/beam/sdks/python/.eggs/dill-0.2.9-py3.5.egg/dill/_dill.py", line 305, in load
      obj = pik.load()
      File "/home/robbe/workspace/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/avro/schema.py", line 173, in __setitem__
      % (key, value, self))
      Exception: Attempting to map key 'favorite_color' to value <avro.schema.Field object at 0x7f8f72d0d0b8> in ImmutableDict {}
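
      The last frames show the exception being raised from `__setitem__` while the unpickler is restoring the object, with the target dict still empty (`ImmutableDict {}`). A minimal sketch of that mechanism follows. It makes two assumptions: the `ImmutableDict` class here is a simplified, hypothetical stand-in rather than avro's actual code, and the standard-library `pickle` module is used in place of dill. On load, a dict subclass is created empty and then repopulated key by key through `__setitem__`, which is exactly the call an immutable mapping rejects.

```python
import pickle


class ImmutableDict(dict):
    """Hypothetical stand-in for a dict subclass that forbids item assignment."""

    def __init__(self, items=None):
        # dict.__init__ fills the mapping without going through __setitem__.
        super().__init__(items or {})

    def __setitem__(self, key, value):
        raise Exception(
            "Attempting to map key %r to value %r in ImmutableDict %r"
            % (key, value, self))


original = ImmutableDict({"favorite_color": "blue"})

# Serialising works: the items are simply read out of the mapping.
payload = pickle.dumps(original, protocol=2)

# Deserialising creates an empty instance and then assigns each item,
# which triggers the overridden __setitem__.
try:
    pickle.loads(payload)
    error = None
except Exception as exc:
    error = str(exc)

print(error)
```

      In this sketch the failure happens during `loads`, not `dumps`, which matches the `pickler.loads(pickler.dumps(self.args))` frame in the traceback above.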

      apache_beam.io.avroio_test.TestAvro.test_split_points
      apache_beam.io.avroio_test.TestFastAvro.test_split_points

      fail with:

      Traceback (most recent call last):
      File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/avroio_test.py", line 308, in test_split_points
      self.assertEquals(split_points_report[-10:], [(2, 1)] * 10)
      AssertionError: Lists differ: [(10, 1), (10, 1), (10, 1), (10, 1), (10, 1[42 chars], 1)] != [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2[32 chars], 1)]
      First differing element 0:
      (10, 1)
      (2, 1)
      + [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1)]
      - [(10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1),
      - (10, 1)] 
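
      To read the diff above: in `unittest` list comparisons, lines prefixed with `-` come from the first argument (here the observed `split_points_report[-10:]`) and lines prefixed with `+` come from the second (the expected value). The sketch below reproduces the same kind of message using values copied from the output above; it does not involve Beam, and the variable names are illustrative only. It uses `assertEqual`, since `assertEquals` is a deprecated alias.

```python
import unittest

# Values copied from the failure output; names are illustrative only.
observed = [(10, 1)] * 10   # what the test actually saw
expected = [(2, 1)] * 10    # what the test asserts

case = unittest.TestCase()
try:
    case.assertEqual(observed[-10:], expected)
    message = None
except AssertionError as exc:
    message = str(exc)

print(message)
```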
      

            People

              Assignee: Valentyn Tymofieiev (tvalentyn)
              Reporter: Robbe (RobbeSneyders)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 11.5h