Beam / BEAM-14083

ReadFromBigQuery examples throw pickling exception when using InteractiveRunner

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.35.0, 2.36.0, 2.37.0
    • Fix Version/s: 2.38.0
    • None
    • Environment: Cloud Dataflow Workbench Notebook on GCP
      Apache Beam 2.37.0 Kernel for Python 3

    Description

      When the Python InteractiveRunner is combined with beam.io.ReadFromBigQuery, the canonical BigQuery examples from the Beam Python tutorials trigger an exception that appears to result from a failure to serialize generators:

      notebook.py
      import apache_beam as beam
      from apache_beam.runners.interactive.interactive_runner import InteractiveRunner

      # `options` and `gcs_location` are assumed to be defined earlier in the notebook.
      pipeline = beam.Pipeline(InteractiveRunner(), options=options)
      max_temperatures = (
          pipeline
          | 'QueryTableStdSQL' >> beam.io.ReadFromBigQuery(
              query='SELECT max_temperature FROM '\
                    '`clouddataflow-readonly.samples.weather_stations`',
              use_standard_sql=True, gcs_location=gcs_location)
          # Each row is a dictionary where the keys are the BigQuery columns
          | beam.Map(lambda elem: elem['max_temperature']))
      pipeline.run()
      
      ~/apache-beam-2.37.0/lib/python3.7/site-packages/apache_beam/coders/coders.py in <lambda>(x)
          800     protocol = pickle.HIGHEST_PROTOCOL
          801     return coder_impl.CallbackCoderImpl(
      --> 802         lambda x: dumps(x, protocol), pickle.loads)
          803 
          804   def as_deterministic_coder(self, step_label, error_message=None):
      
      TypeError: can't pickle generator objects [while running '[6]: QueryTableStdSQL/Read/SDFBoundedSourceReader/ParDo(SDFBoundedSourceDoFn)/SplitAndSizeRestriction']
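
      The underlying TypeError can be reproduced without Beam or BigQuery. The sketch below is not the Beam code path itself. It only illustrates that the standard pickle module, which the coder in the traceback calls, rejects generator objects while accepting a materialized list. The function names rows and try_pickle, and the sample values, are hypothetical.

```python
import pickle


def rows():
    """Hypothetical stand-in for a source that yields rows lazily."""
    yield {"max_temperature": 31.4}
    yield {"max_temperature": 28.9}


def try_pickle(value):
    """Return the pickled bytes, or the error message if pickling fails."""
    try:
        return pickle.dumps(value, pickle.HIGHEST_PROTOCOL)
    except TypeError as exc:
        # Generators carry live execution state, so pickle refuses them.
        return str(exc)


generator_result = try_pickle(rows())    # a str holding the TypeError message
list_result = try_pickle(list(rows()))   # bytes, because a list is picklable

print(type(generator_result).__name__, "-", generator_result)
print(type(list_result).__name__, "-", len(list_result), "bytes")
```

      The exact wording of the error message differs between Python versions, but in each case it is a TypeError that names the generator.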
      

      The same interactive pipeline works as expected in version 2.34.
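
      A notebook could guard against the affected releases before building the pipeline. The following is a minimal sketch, assuming the version numbers listed in this issue mean that 2.35.0 through 2.37.0 are affected and 2.38.0 contains the fix. The helpers major_minor and is_affected are hypothetical and not part of Beam.

```python
import re

# Assumed from this issue: first affected release line and first fixed one.
FIRST_AFFECTED = (2, 35)
FIRST_FIXED = (2, 38)


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.37.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(version):
    """Return True if the version falls in the assumed affected range."""
    return FIRST_AFFECTED <= major_minor(version) < FIRST_FIXED


# In a notebook, the argument would typically be apache_beam.__version__.
for candidate in ("2.34.0", "2.37.0", "2.38.0"):
    print(candidate, is_affected(candidate))
```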

            People

              Assignee: ningk Ning
              Reporter: mgthesecond Mark Grey