Beam / BEAM-7025

Python pipelines should not be able to use output tags that are not defined in with_outputs.

Details

    • Type: New Feature
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: sdk-py-core
    • Labels: None

    Description

      Emitting to an output tag that is not declared in with_outputs indicates that the user has misconfigured a Beam pipeline.

      If .with_outputs is not used, there is no way to get a handle to the PCollection produced for that output tag. This should therefore be disallowed entirely, and a runtime exception should be thrown.

      Note:
      The bundle descriptor knows which tags are available for each step, so the problem can be detected at runtime. We need to be careful not to check on every element, for performance reasons.

      I suspect it is also possible to detect this statically, but that may require collecting more information.

      There should already be a code path that collects the bundle's elements into the different tags as they are output. At the end of bundle execution we can check the collected tags, which would be cheap.
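
      A minimal sketch of the end-of-bundle check suggested in the note, written in plain Python rather than against Beam's actual worker code. `BundleOutputs`, `UndeclaredOutputTagError`, `MAIN_TAG`, and the method names are all hypothetical and invented for illustration; they are not Beam APIs. It assumes the declared tags for a step are known up front, as they would be from the bundle descriptor.

```python
from collections import defaultdict

# Hypothetical stand-in for the main (untagged) output of a step.
MAIN_TAG = None


class UndeclaredOutputTagError(RuntimeError):
    """Raised when a step emitted to a tag it never declared."""


class BundleOutputs:
    """Hypothetical per-step collector for the outputs of one bundle."""

    def __init__(self, step_name, declared_tags):
        self.step_name = step_name
        # The main output is always allowed, whether or not tags are declared.
        self.declared_tags = frozenset(declared_tags) | {MAIN_TAG}
        self.by_tag = defaultdict(list)

    def emit(self, element, tag=MAIN_TAG):
        # Hot path: no validation per element, only a dict lookup and append.
        self.by_tag[tag].append(element)

    def finish_bundle(self):
        # Cold path: a single set difference per bundle, not per element.
        undeclared = set(self.by_tag) - self.declared_tags
        if undeclared:
            raise UndeclaredOutputTagError(
                "Step %r emitted to undeclared output tag(s) %s; "
                "declare them in with_outputs."
                % (self.step_name, sorted(undeclared, key=str)))
        return dict(self.by_tag)


# Usage: the step declares only 'even' but also emits to 'odd'.
outputs = BundleOutputs("split", declared_tags=["even"])
outputs.emit(2, tag="even")
outputs.emit(3, tag="odd")
try:
    outputs.finish_bundle()
except UndeclaredOutputTagError as err:
    print(err)
```

      The trade-off in this sketch is that the error surfaces only once the bundle finishes, so elements sent to the undeclared tag have already been buffered by then. In exchange, the per-element path stays free of validation.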

          People

            Assignee: Unassigned
            Reporter: Alex Amato (ajamato@google.com)
            Votes: 0
            Watchers: 2
