Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: sdk-py-core
    • Labels:
      None

      Description

      I have been trying to use Google Datalab with Python 3. As far as I can see, several packages that Google Datalab depends on do not support Python 3 yet. This is one of them.

      https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/6

        Attachments

          Issue Links

          1.
          Support Python native types in Beam typehints Sub-task Open Udi Meiri

          100%

          Original Estimate - Not Specified
          Time Spent - 0.5h
          2.
          Make the coders package compatible with Python 3 Sub-task Resolved Luke Zhu  
          3.
          Enable tests to run in Python 3 Sub-task Closed Luke Zhu

          100%

          Original Estimate - Not Specified
          Time Spent - 4.5h
          4.
          Finish io futurize stage 2: fix the missing pylint3 check in tox.ini Sub-task Resolved Matthias Feys

          100%

          Original Estimate - Not Specified
          Time Spent - 2.5h
          5.
          Create a tox environment that uses Py3 interpreter for pre/post commit test suites, once codebase supports Py3. Sub-task Resolved Matthias Feys

          100%

          Original Estimate - Not Specified
          Time Spent - 2h
          6.
          Add an SDK harness container with Python 3 interpreter for portable pipelines. Sub-task Resolved Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 40m
          7.
          Exercise Python 3 SDK harness container in ValidatesContainer Jenkins test suite. Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 4h 10m
          8.
          Finish Python 3 porting for coders module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 4h
          9.
          Finish Python 3 porting for examples module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 2.5h
          10.
          Finish Python 3 porting for internal module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 0.5h
          11.
          Finish Python 3 porting for io module Sub-task Closed Juta Staes

          100%

          Original Estimate - Not Specified
          Time Spent - 20h
          12.
          Finish Python 3 porting for metrics module Sub-task Resolved Robbe  
          13.
          Finish Python 3 porting for options module Sub-task Resolved Manu Zhang

          100%

          Original Estimate - Not Specified
          Time Spent - 3h 40m
          14.
          Finish Python 3 porting for portability module Sub-task Resolved Robbe  
          15.
          Finish Python 3 porting for runners module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 6h 20m
          16.
          Finish Python 3 porting for testing module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 5h 50m
          17.
          Finish Python 3 porting for transforms module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 3h
          18.
          Finish Python 3 porting for typehints module Sub-task Closed Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 1.5h
          19.
          Finish Python 3 porting for utils module Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          20.
          Finish Python 3 porting for unpackaged files Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 4h 10m
          21.
          Add tox suites to exercise unit tests using Python3 interpreter with cython, and with gcp dependencies. Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 2h 50m
          22.
          Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function Sub-task Closed Unassigned

          100%

          Original Estimate - Not Specified
          Time Spent - 3.5h
          23.
          Several tests fail on Python 3 with Failed assert: [<some number>] == [nan] Sub-task Resolved Robbe  
          24.
          Side inputs don't work on Python 3 Sub-task Closed Robert Bradshaw

          100%

          Original Estimate - Not Specified
          Time Spent - 1.5h
          25.
          Several tests fail on Python 3 with: unsupported operand type(s) for +: 'int' and 'EmptySideInput' Sub-task Resolved Unassigned  
          26.
          Some tests use assertItemsEqual method, not available in Python 3 Sub-task Resolved Matthias Feys

          100%

          Original Estimate - Not Specified
          Time Spent - 0.5h
          27.
          Several tests fail on Python 3 with TypeError: unorderable types: str() < int() Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 6h 20m
          28.
          Several tests fail on Python 3 with: Runtime type violation detected Sub-task Closed Unassigned  
          29.
          Several IO tests hang indefinitely during execution on Python 3. Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 50m
          30.
          Avro IO does not work with avro-python3 package out-of-the-box on Python 3, several tests fail with AttributeError (module 'avro.schema' has no attribute 'parse') Sub-task Resolved Simon

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          31.
          Several IO tests fail in Python 3 with RuntimeError('dictionary changed size during iteration',) Sub-task Resolved Ruoyun Huang

          100%

          Original Estimate - Not Specified
          Time Spent - 6h
          32.
          Investigate why test_split_at_fraction_exhaustive consistently fails to split after 101 attempts on Python 3 Sub-task In Progress Frederik Bode

          100%

          Original Estimate - Not Specified
          Time Spent - 4.5h
          33.
          Several VcfIO tests fail in Python 3 with TypeError: cannot use a string pattern on a bytes-like object Sub-task Open Simon  
          34.
          Several typehints tests fail on Python 3 with ValueError: no signature found for builtin <method 'upper' of 'str' objects> Sub-task Resolved Robbe  
          35.
          Add tox suites for various Python 3 versions (3.5, 3.6, 3.7) Sub-task Closed Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 5h 20m
          36.
          Default coder breaks with large ints on Python 3 Sub-task Resolved Robert Bradshaw

          100%

          Original Estimate - Not Specified
          Time Spent - 1h 40m
          37.
          Disable compare parameter in Top.Of() combiner when executing in Python 3. Sub-task Resolved Robert Bradshaw

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          38.
          Util test on annotations fails Sub-task Resolved Ruoyun Huang

          100%

          Original Estimate - Not Specified
          Time Spent - 5h 10m
          39.
          Using methods in map is broken on Python 3 Sub-task Resolved Unassigned  
          40.
          Validates runner tests fail with: Cannot convert bytes value to JSON value Sub-task Closed Mark Liu  
          41.
          wordcount_fnapi_it failed on TestDataflowRunner because of JSON string decoding error Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 4h 50m
          42.
          Support DoFns with Keyword-only arguments in Python 3. Sub-task Open Juta Staes

          100%

          Original Estimate - Not Specified
          Time Spent - 5h
          43.
          TFRecordio not Py3 compatible Sub-task Closed Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 1h 50m
          44.
          Enable WordCount example on DataflowRunner on Python 3 Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 4h 40m
          45.
          Gradle setupVirtualenv supports Python 3 Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 3h 50m
          46.
          Revert dill pip install from github commit Sub-task Resolved Valentyn Tymofieiev  
          47.
          Gcsio batch delete broken in Python 3 Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 2h
          48.
          Enable support for save_main_session in Python 3 Sub-task Open Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 3.5h
          49.
          Opcounters sampling test fails for some random seeds on Python3 Sub-task Resolved Robbe  
          50.
          TypeError in DataflowRunner: dict_values does not support indexing Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 40m
          51.
          Dill fails to pickle avro.RecordSchema classes on Python 3. Sub-task Open Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 7.5h
          52.
          Parallel tox (unit) tests run on Jenkins Sub-task Closed Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 12h 10m
          53.
          BigQuery IO does not work in Python 3 Sub-task Resolved Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          54.
          TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3 Sub-task Open Unassigned  
          55.
          GCS IO tests are very flaky under Python 3.5 Sub-task Resolved Juta Staes

          100%

          Original Estimate - Not Specified
          Time Spent - 3h 50m
          56.
          Dataflow Python runner should use a Python-3 compatible container when starting a Python 3 pipeline. Sub-task Resolved Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 0.5h
          57.
          Add integration test on DirectRunner in Python 3 Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          58.
          Beam Python SDK release qualification should verify supported Python 3 versions. Sub-task Closed Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 4h 20m
          59.
          Stager should stage Python 3 wheels for Beam SDK once they are released. Sub-task Closed Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 20m
          60.
          Release Python 3 wheels with first Beam SDK release that supports Python 3. Sub-task Closed Robert Bradshaw  
          61.
          Add PostCommit suite for integration tests on DataflowRunner Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 26h 20m
          62.
          Exercise Dataflow runner integration tests in a postcommit suite for Python 3.5 and 3.6 Sub-task Resolved Juta Staes

          100%

          Original Estimate - Not Specified
          Time Spent - 3h 10m
          63.
          Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3. Sub-task Closed Frederik Bode

          100%

          Original Estimate - Not Specified
          Time Spent - 14h 40m
          64.
          Exercise direct runner integration tests in a postcommit suite for Python 3.5 and 3.6. Sub-task Resolved Juta Staes  
          65.
          SDK source tarball is different when created on Python 2 and Python 3 Sub-task Resolved Valentyn Tymofieiev  
          66.
          Typehinting depends on typing changes in Python 3.5.3 Sub-task Resolved Robbe

          100%

          Original Estimate - Not Specified
          Time Spent - 1.5h
          67.
          Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. Sub-task Resolved Pablo Estrada

          100%

          Original Estimate - Not Specified
          Time Spent - 6h 50m
          68.
          Block size difference in avro library on Python3 causes some AvroIO tests to fail. Sub-task Closed Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 1h 20m
          69.
          BigQuery IO does not support bytes in Python 3 Sub-task Closed Juta Staes

          100%

          Original Estimate - Not Specified
          Time Spent - 20h 50m
          70.
          Add Streaming wordcount test to Dataflow ValidatesContainer test suite Sub-task Open Unassigned  
          71.
          python 3 test_hourly_team_score_it fails with bigquery job id already exists Sub-task Closed Unassigned  
          72.
          test_multimap_side_input in fn_api.runner_test fails on Python 3.6 Sub-task Closed Robbe  
          73.
          Add Python3 performance benchmarks Sub-task Resolved Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 16h 10m
          74.
          Configurable Python interpreter version in Gradle Sub-task Closed Mark Liu

          100%

          Original Estimate - Not Specified
          Time Spent - 2h
          75.
          Design Py3-compatible typehints annotation support in Beam 3. Sub-task Open Udi Meiri  
          76.
          Enable use_fastavro experiment on Dataflow Runner for all Py3 jobs. Sub-task Resolved Frederik Bode  
          77.
          Add DirectRunnerIT test suite to Python3 Postcommit suite. Sub-task Closed Juta Staes  
          78.
          TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText Sub-task Resolved yoshiki obata

          100%

          Original Estimate - Not Specified
          Time Spent - 2h
          79.
          Rename ToStringCoder into ToBytesCoder Sub-task Open Unassigned  
          80.
          Dataflow runner should set use_fastavro experiment on Python 3. Sub-task Resolved Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 2h 40m
          81.
          Enable Python3 tests for Flink Sub-task Open Unassigned  
          82.
          Enable Python3 tests for Spark Sub-task Open Kyle Weaver  
          83.
          Support Py3 Dataclasses Sub-task Open Unassigned  
          84.
          Revise BQ integration tests to clearly communicate that BQ IO expects base64-encoded bytes.  Sub-task Resolved Juta Staes  
          85.
          apache_beam.io.avroio_test.TestAvro.test_dynamic_work_rebalancing_exhaustive is very slow Sub-task Resolved Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 1h 40m
          86.
          Clean up Python 2 codepaths once Beam no longer supports Python 2. Sub-task Open Unassigned  
          87.
          FastAvroTest has slow test_dynamic_exhaustive on Python 2 and 3. Sub-task Closed Unassigned  
          88.
          Create a Wordcount-on-Flink Python 3 test suite. Sub-task Resolved Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 5h 10m
          89.
          Document Python 3 support in Beam starting from 2.14.0 Sub-task Open Rose Nguyen  
          90.
          Add Python 3.6, 3.7 as supported qualifiers to setup.py. Sub-task Open Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 10m
          91.
          Improve Avro IO integration test coverage on Python 3. Sub-task Open Frederik Bode

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          92.
          Add smoke integration tests to Precommit test suites on Python 3 Sub-task Closed Valentyn Tymofieiev

          100%

          Original Estimate - Not Specified
          Time Spent - 1h
          93.
          Add SDK harness containers for Py 3.6, Py 3.7 Sub-task Open Unassigned  
          94.
          deadlock using save_main_session and logging Sub-task Open Valentyn Tymofieiev  
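
          Several sub-task titles above name recurring Python 2 to 3 failure classes: the removed 'cmp' keyword, unorderable str/int comparisons, dict_values that cannot be indexed, dictionaries changing size during iteration, and the missing assertItemsEqual method. The following is a minimal sketch of how each class is commonly addressed, using only the standard library. The helper names are hypothetical and are not taken from the Beam codebase; the actual fixes in the linked sub-tasks may differ.

```python
import functools
import unittest


def sort_with_cmp(items, cmp_func):
    """Python 3 stand-in for sorted(items, cmp=cmp_func), which raises TypeError."""
    return sorted(items, key=functools.cmp_to_key(cmp_func))


def sort_mixed(items):
    """Order str and int values together.

    Python 3 refuses to compare str with int, so sort by type name first.
    """
    return sorted(items, key=lambda x: (type(x).__name__, x))


def first_value(mapping):
    """Python 3 stand-in for mapping.values()[0]; dict_values is a view, not a list."""
    return next(iter(mapping.values()))


def drop_empty(mapping):
    """Delete falsy entries without mutating the dict while iterating over it."""
    for key in list(mapping):  # snapshot the keys before deleting
        if not mapping[key]:
            del mapping[key]
    return mapping


def assert_same_items(actual, expected):
    """assertItemsEqual was renamed to assertCountEqual in Python 3."""
    unittest.TestCase().assertCountEqual(actual, expected)
```

          For example, sort_with_cmp([3, 1, 2], lambda a, b: b - a) sorts in descending order, and drop_empty({"a": 1, "b": 0}) leaves only the "a" entry.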

            Activity

              People

              • Assignee: Robbe (RobbeSneyders)
              • Reporter: Eyad Sibai (eyad.alsibai@gmail.com)
              • Votes: 38
              • Watchers: 61

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 299h 20m