Beam / BEAM-9154

Move Chicago Taxi Example to Python 3

Details

    • Type: Improvement
    • Status: Open
    • Priority: P4
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: testing
    • Labels: None

    Description

      The Chicago Taxi Example[1] should be migrated to the latest Python version supported by Beam (currently Python 3.7).

      At the moment, running the benchmark on Python 3.7 fails with the following error (requires further investigation):

      Traceback (most recent call last):
        File "preprocess.py", line 259, in <module>
          main()
        File "preprocess.py", line 254, in main
          project=known_args.metric_reporting_project
        File "preprocess.py", line 155, in transform_data
          ('Analyze' >> tft_beam.AnalyzeDataset(preprocessing_fn)))
        File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 987, in __ror__
          return self.transform.__ror__(pvalueish, self.label)
        File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 547, in __ror__
          result = p.apply(self, pvalueish, label)
        File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 532, in apply
          return self.apply(transform, pvalueish)
        File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 573, in apply
          pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
        File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
          return m(transform, input, options)
        File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 223, in apply_PTransform
          return transform.expand(input)
        File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 825, in expand
          input_metadata))
        File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 716, in expand
          output_signature = self._preprocessing_fn(copied_inputs)
        File "preprocess.py", line 102, in preprocessing_fn
          _fill_in_missing(inputs[key]),
      KeyError: 'company'
      

      [1] sdks/python/apache_beam/testing/benchmarks/chicago_taxi
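
A possible first step for the investigation, assuming the KeyError means that the 'company' column never reaches the `inputs` dict passed to `preprocessing_fn`: compare the keys the function expects against the keys it actually receives, before indexing. This is only a sketch. The helper `missing_feature_keys`, the sample `inputs` dict, and the `expected` list are hypothetical and are not taken from preprocess.py.

```python
def missing_feature_keys(inputs, expected_keys):
    """Return the expected keys that are absent from `inputs`, in order."""
    return [key for key in expected_keys if key not in inputs]


# Hypothetical stand-in for the dict that preprocessing_fn receives.
inputs = {"fare": [12.5], "trip_miles": [3.1]}
# Hypothetical list of feature keys the function iterates over.
expected = ["fare", "trip_miles", "company"]

missing = missing_feature_keys(inputs, expected)
if missing:
    # Reporting every missing key at once is more informative than the
    # first KeyError raised by inputs[key].
    print("Missing feature keys:", missing)
```

Calling such a check at the top of `preprocessing_fn` would show whether 'company' is the only absent key or one of several, which may help narrow the problem down to the schema or to the input data.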

            People

              Assignee: Unassigned
              Reporter: Kamil Wasilewski (kamilwu)
              Votes: 0
              Watchers: 7

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 40m