Details
Type: Bug
Status: Open
Priority: P2
Resolution: Unresolved
Affects Version/s: 2.35.0, 2.36.0, 2.37.0, 2.38.0
None
None
Environment: OS: Linux, Python 3.8.12
Description
We have discovered a potential bug: when you execute a pipeline that contains
a DataframeTransform with the "runtime_type_check" option set to True, Apache
Beam's type checking raises a cryptic error.
Simple example to reproduce the bug:
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam import Pipeline, Create, Row
from apache_beam.dataframe.transforms import DataframeTransform

pipeline = Pipeline(options=PipelineOptions(runtime_type_check=True))
pipeline | Create([Row(val1=1)]) | DataframeTransform(lambda df: df)
pipeline.run()
This raises an apache_beam.typehints.decorators.TypeCheckError:
File ".....lib/python3.8/site-packages/apache_beam/typehints/typehints.py", line 416, in check_constraint
    raise SimpleTypeHintError
apache_beam.typehints.decorators.TypeCheckError: According to type-hint expected output should be of type <class 'apache_beam.typehints.schemas.BeamSchema_118086df_671f_4643_a929_ba65de48e7e8'>. Instead, received 'BeamSchema_118086df_671f_4643_a929_ba65de48e7e8(val1=1)', an instance of type <class 'apache_beam.typehints.schemas.BeamSchema_118086df_671f_4643_a929_ba65de48e7e8'>. [while running 'DataframeTransform/Unbatch 'placeholder_DataFrame_140623617251840'/ParDo(_UnbatchNoIndex)']
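The message is confusing because the expected type and the received type print with the same name. One possible explanation, which this report does not confirm, is that two distinct class objects sharing one generated name exist at runtime. The sketch below reproduces that effect in plain Python without Beam; make_row_type and the name BeamSchema_example are hypothetical stand-ins for illustration, not Beam APIs.

```python
from collections import namedtuple


def make_row_type(name):
    # Hypothetical helper (not a Beam API): each call builds a brand-new
    # class object, even when the name and fields are identical.
    return namedtuple(name, ["val1"])


# Assumption: the type hint and the produced element come from two
# separate generations of a class with the same generated name.
expected_type = make_row_type("BeamSchema_example")
actual_type = make_row_type("BeamSchema_example")

row = actual_type(val1=1)

# The names match, so an error message would print them identically...
same_name = expected_type.__name__ == type(row).__name__
# ...but the isinstance check compares class identity, so it fails.
passes_check = isinstance(row, expected_type)

print(same_name, passes_check)
```

If this is what happens inside DataframeTransform, comparing the id() of the hinted class with the id() of type(element) would show whether they are the same object.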