Details
-
Improvement
-
Status: Open
-
P4
-
Resolution: Unresolved
-
None
-
None
-
None
Description
The Chicago Taxi Example[1] should be moved to the latest version of Python supported by Beam (currently it's Python 3.7).
At the moment, the following error occurs when running the benchmark on Python 3.7 (requires futher investigation):
Traceback (most recent call last): File "preprocess.py", line 259, in <module> main() File "preprocess.py", line 254, in main project=known_args.metric_reporting_project File "preprocess.py", line 155, in transform_data ('Analyze' >> tft_beam.AnalyzeDataset(preprocessing_fn))) File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 987, in __ror__ return self.transform.__ror__(pvalueish, self.label) File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 547, in __ror__ result = p.apply(self, pvalueish, label) File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 532, in apply return self.apply(transform, pvalueish) File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 573, in apply pvalueish_result = self.runner.apply(transform, pvalueish, self._options) File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply return m(transform, input, options) File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 223, in apply_PTransform return transform.expand(input) File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 825, in expand input_metadata)) File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 716, in expand output_signature = self._preprocessing_fn(copied_inputs) File "preprocess.py", line 102, in preprocessing_fn _fill_in_missing(inputs[key]), KeyError: 'company'
[1] sdks/python/apache_beam/testing/benchmarks/chicago_taxi
Attachments
Issue Links
- is blocked by
-
BEAM-10422 Python Chicago Taxi Jobs are failing
- Open
- is related to
-
BEAM-8892 Move Chicago Taxi Example Benchmarks to Beam orchestrator
- Open
-
BEAM-7079 Run Chicago Taxi Example on Dataflow
- Resolved
-
BEAM-8108 Run Chicago Taxi Example on Flink
- Resolved
- links to