BEAM-12222: Dataflow side input translation "Unknown producer for value"

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P1
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.30.0
    • Component/s: runner-dataflow
    • Labels: None

    Description

      I have identified a seemingly nondeterministic issue in Dataflow translation: pipelines with side inputs are sometimes translated in the wrong order, and translation then fails with:

      java.lang.NullPointerException: Unknown producer for value SimplePCollectionView{tag=Tag<org.apache.beam.sdk.values.PCollectionViews$SimplePCollectionView.<init>:1221#4dca087078898728>} while translating step TfIdf.ComputeTfIdf/Combine.globally(Count)/ProduceDefault
      	at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:1227)
      

      Seen on https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2_PR/32/testReport/junit/org.apache.beam.examples.complete/TfIdfIT/testE2ETfIdf/ and also on other changes. I think the change under test is not the cause; it just triggers the nondeterministic problem.

      So there is a lurking problem with side inputs overall.
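
To make the failure mode concrete, here is a toy sketch of how an order-dependent translation can produce this kind of error. It is not the Dataflow runner's translator: the class `TranslationOrderSketch`, the `Step` type, and the step and value names are all hypothetical, and `Objects.requireNonNull` stands in for the vendored Guava `checkNotNull` seen in the trace. The only assumption it illustrates is that a step which reads a side input view is translated before the step that produces that view has been registered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy model only: none of these names come from Beam's Dataflow runner. */
public class TranslationOrderSketch {

  /** A hypothetical step: a name, the values it reads, and the value it produces. */
  static final class Step {
    final String name;
    final List<String> inputs;
    final String output;

    Step(String name, List<String> inputs, String output) {
      this.name = name;
      this.inputs = inputs;
      this.output = output;
    }
  }

  /**
   * Translates steps in the given order and returns the names of the translated steps.
   * Throws NullPointerException if a step reads a value whose producer is not yet registered.
   */
  static List<String> translate(List<Step> steps) {
    Map<String, String> producers = new HashMap<>(); // value -> name of producing step
    List<String> translated = new ArrayList<>();
    for (Step step : steps) {
      for (String input : step.inputs) {
        // Fails when the consumer is visited before the producer of its input.
        Objects.requireNonNull(
            producers.get(input),
            "Unknown producer for value " + input + " while translating step " + step.name);
      }
      producers.put(step.output, step.name);
      translated.add(step.name);
    }
    return translated;
  }

  public static void main(String[] args) {
    Step createView = new Step("CreateView", Collections.<String>emptyList(), "view");
    Step produceDefault = new Step("ProduceDefault", Arrays.asList("view"), "result");

    // Producer first: translation succeeds.
    System.out.println(translate(Arrays.asList(createView, produceDefault)));

    // Consumer first: the producer lookup fails, similar in shape to the reported trace.
    try {
      translate(Arrays.asList(produceDefault, createView));
    } catch (NullPointerException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The sketch only shows why visiting order matters for a producer lookup; it says nothing about what makes the real traversal order vary between runs.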

          People

            Assignee: Kenneth Knowles (kenn)
            Reporter: Kenneth Knowles (kenn)

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 5h 50m