Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-7755

BigQuery Repeated Records do not seem to work

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.12.0, 2.13.0
    • Fix Version/s: 2.15.0
    • Component/s: io-java-avro, io-java-gcp
    • Labels:
      None

      Description

      When translating BigQuery rows to beam rows, specifically using theĀ  BigQueryUtils.toBeamRow(record, beamSchema) method, REPEATEDĀ RECORDS causes an error. This seems to be caused that avro arrays are thought to only have primitive types but these are arrays with a ROW type:

      Caused by: java.lang.RuntimeException: ROW is not primitive type.
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroPrimitiveTypes(BigQueryUtils.java:467)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroArray(BigQueryUtils.java:427)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroFormat(BigQueryUtils.java:373)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$2(BigQueryUtils.java:222)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
      	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:223)
      	at com.spotify.data.sql.RowSource.lambda$bigquery$120a5f9f$1(RowSource.java:55)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:242)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:235)
      	at org.apache.beam.sdk.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:597)
      	at org.apache.beam.sdk.io.BlockBasedSource$BlockBasedReader.readNextRecord(BlockBasedSource.java:209)
      	at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:484)
      	at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
      	at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:249)
      	at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:601)
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                snallapa Sahith Nallapareddy
                Reporter:
                snallapa Sahith Nallapareddy
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 2.5h
                  2.5h