Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-7755

BigQuery Repeated Records do not seem to work

Details

    • Bug
    • Status: Resolved
    • P2
    • Resolution: Fixed
    • 2.12.0, 2.13.0
    • 2.15.0
    • io-java-avro, io-java-gcp
    • None

    Description

      When translating BigQuery rows to beam rows, specifically using theĀ  BigQueryUtils.toBeamRow(record, beamSchema) method, REPEATEDĀ RECORDS causes an error. This seems to be caused that avro arrays are thought to only have primitive types but these are arrays with a ROW type:

      Caused by: java.lang.RuntimeException: ROW is not primitive type.
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroPrimitiveTypes(BigQueryUtils.java:467)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroArray(BigQueryUtils.java:427)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroFormat(BigQueryUtils.java:373)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$2(BigQueryUtils.java:222)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
      	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:223)
      	at com.spotify.data.sql.RowSource.lambda$bigquery$120a5f9f$1(RowSource.java:55)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:242)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:235)
      	at org.apache.beam.sdk.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:597)
      	at org.apache.beam.sdk.io.BlockBasedSource$BlockBasedReader.readNextRecord(BlockBasedSource.java:209)
      	at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:484)
      	at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
      	at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:249)
      	at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:601)
      

      Attachments

        Issue Links

          Activity

            People

              snallapa Sahith Nallapareddy
              snallapa Sahith Nallapareddy
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 2.5h
                  2.5h