Beam / BEAM-9043

BigQueryIO fails cryptically if gcpTempLocation is set and tempLocation is not

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Component: io-java-gcp

    Description

      Running a pipeline that uses BigQueryIO with gcpTempLocation set but tempLocation unset fails with the NullPointerException below. We should either handle this case gracefully or throw a more helpful error, such as "please specify tempLocation".

      2019-12-24 13:06:18 WARN  UnboundedReadFromBoundedSource:152 - Exception while splitting org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySource@5d21202d, skips the initial splits.
      java.lang.NullPointerException
              at java.util.regex.Matcher.getTextLength(Matcher.java:1283)
              at java.util.regex.Matcher.reset(Matcher.java:309)
              at java.util.regex.Matcher.<init>(Matcher.java:229)
              at java.util.regex.Pattern.matcher(Pattern.java:1093)
              at org.apache.beam.sdk.io.FileSystems.parseScheme(FileSystems.java:447)
              at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:533)
              at org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.resolveTempLocation(BigQueryHelpers.java:706)
              at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.extractFiles(BigQuerySourceBase.java:125)
              at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.split(BigQuerySourceBase.java:148)
              at org.apache.beam.runners.core.construction.UnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter.split(UnboundedReadFromBoundedSource.java:144)
              at org.apache.beam.runners.dataflow.internal.CustomSources.serializeToCloudSource(CustomSources.java:87)
              at org.apache.beam.runners.dataflow.ReadTranslator.translateReadHelper(ReadTranslator.java:51)
              at org.apache.beam.runners.dataflow.DataflowRunner$StreamingUnboundedRead$ReadWithIdsTranslator.translate(DataflowRunner.java:1590)
              at org.apache.beam.runners.dataflow.DataflowRunner$StreamingUnboundedRead$ReadWithIdsTranslator.translate(DataflowRunner.java:1587)
              at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.visitPrimitiveTransform(DataflowPipelineTranslator.java:475)
              at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
              at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
              at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
              at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
              at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
              at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
              at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
              at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
              at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.translate(DataflowPipelineTranslator.java:414)
              at org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translate(DataflowPipelineTranslator.java:173)
              at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:763)
              at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:186)
              at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
              at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
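
      The trace above suggests that a null location string reaches the regex matcher in FileSystems.parseScheme. Below is a minimal standalone sketch of that failure mode and of one possible guard. It does not use Beam: the class TempLocationCheck, its scheme pattern, and the fallback to gcpTempLocation are all hypothetical and only illustrate the two options named in the description (handle gracefully, or fail with a clear message). It is not the project's actual fix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TempLocationCheck {
    // Simplified stand-in for a URI scheme pattern (assumption, not Beam's regex).
    private static final Pattern SCHEME =
        Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):/.*");

    /** Reproduces the failure mode: Pattern.matcher(null) throws NullPointerException. */
    public static String parseScheme(String spec) {
        Matcher m = SCHEME.matcher(spec);
        return m.matches() ? m.group("scheme") : "file";
    }

    /**
     * Hypothetical guard: prefer tempLocation, fall back to gcpTempLocation,
     * and otherwise fail with an actionable message instead of an NPE.
     */
    public static String resolveTempLocation(String tempLocation, String gcpTempLocation) {
        if (tempLocation != null && !tempLocation.isEmpty()) {
            return tempLocation;
        }
        if (gcpTempLocation != null && !gcpTempLocation.isEmpty()) {
            return gcpTempLocation;
        }
        throw new IllegalArgumentException(
            "BigQueryIO needs a temporary location; please specify tempLocation.");
    }

    public static void main(String[] args) {
        // Unguarded path: a null tempLocation goes straight to the matcher.
        try {
            parseScheme(null);
        } catch (NullPointerException e) {
            System.out.println("Unguarded call failed with NullPointerException");
        }

        // Guarded path: only gcpTempLocation is set, so the guard falls back to it.
        String resolved = resolveTempLocation(null, "gs://bucket/tmp");
        System.out.println("Resolved location: " + resolved);
        System.out.println("Scheme: " + parseScheme(resolved));
    }
}
```

      Whether falling back to gcpTempLocation is the right behaviour for BigQueryIO is an open design question; the IllegalArgumentException branch corresponds to the "more helpful error" alternative.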
      

          People

            Assignee: Unassigned
            Reporter: Brian Hulette (bhulette)