  BEAM-3204 (project: Beam; sub-task of BEAM-3221 Model pipeline representation improvements)

Coders should only have a FunctionSpec, not an SdkFunctionSpec

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.14.0
    • Component/s: beam-model

    Description

      We added environments to coders to account for "custom" coders that only one SDK can realistically understand, like this:

      Coder {
        spec: SdkFunctionSpec {
          environment: "java_sdk_docker_container",
          spec: FunctionSpec {
            urn: "beam:coder:java_custom_coder",
            payload: <serialized java bytes>
          }
        }
      }
      

      But a coder must be understood by both the producer of a PCollection and its consumers. In that respect a coder is not like other UDFs, even though it is user-defined.

      A pipeline where either the producer or consumer cannot handle the coder is invalid, and we will have to build our cross-language APIs to prevent construction of such a pipeline. So we can drop the environment.
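The validity rule above can be illustrated with a minimal Python sketch. It is not Beam's actual API: `PCollectionEdge`, `find_invalid_edges`, the `understood_urns` registry, the `python_sdk_docker_container` environment id, and the use of `beam:coder:varint:v1` as a commonly understood coder are all assumptions made for illustration. The `Coder` here carries only a bare `FunctionSpec`, as this issue proposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSpec:
    urn: str
    payload: bytes = b""


@dataclass(frozen=True)
class Coder:
    # Proposed shape: a bare FunctionSpec with no environment attached.
    spec: FunctionSpec


@dataclass
class PCollectionEdge:
    # Hypothetical stand-in for one PCollection and the environments touching it.
    name: str
    coder: Coder
    producer_env: str
    consumer_envs: list


def find_invalid_edges(edges, understood_urns):
    """Return (pcollection name, environment id) pairs that make a pipeline invalid.

    understood_urns maps an environment id to the set of coder URNs that
    environment can handle (a hypothetical registry, not a Beam structure).
    """
    problems = []
    for edge in edges:
        # The producer and every consumer must all understand the coder.
        for env in [edge.producer_env, *edge.consumer_envs]:
            if edge.coder.spec.urn not in understood_urns.get(env, set()):
                problems.append((edge.name, env))
    return problems


understood_urns = {
    "java_sdk_docker_container": {
        "beam:coder:java_custom_coder",
        "beam:coder:varint:v1",
    },
    "python_sdk_docker_container": {"beam:coder:varint:v1"},
}

edges = [
    # A Java-only custom coder on a PCollection consumed by Python: invalid.
    PCollectionEdge(
        name="pc_custom",
        coder=Coder(FunctionSpec("beam:coder:java_custom_coder")),
        producer_env="java_sdk_docker_container",
        consumer_envs=["python_sdk_docker_container"],
    ),
    # A coder both sides understand: valid.
    PCollectionEdge(
        name="pc_varint",
        coder=Coder(FunctionSpec("beam:coder:varint:v1")),
        producer_env="java_sdk_docker_container",
        consumer_envs=["python_sdk_docker_container"],
    ),
]

invalid = find_invalid_edges(edges, understood_urns)
```

In this sketch the check depends only on the coder's URN and on the environments at either end of the PCollection, so an environment stored on the coder itself would add nothing to the decision.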

      I think some folks want to reserve the ability to add an environment later, so that we do not paint ourselves into a corner. In that case, we can just add a field to Coder.
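As a rough Python sketch of that fallback, an environment could later be added to the coder as an optional field with a default, so that coders built without it keep working. The field name `environment_id` and the dataclass modelling are assumptions for illustration, not the actual Beam proto definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FunctionSpec:
    urn: str
    payload: bytes = b""


@dataclass(frozen=True)
class Coder:
    spec: FunctionSpec
    # Hypothetical later addition: optional, so existing coders need no change.
    environment_id: Optional[str] = None


# A coder built today, with no environment.
plain = Coder(spec=FunctionSpec(urn="beam:coder:java_custom_coder"))

# The same coder if an environment were attached later.
with_env = Coder(
    spec=FunctionSpec(urn="beam:coder:java_custom_coder"),
    environment_id="java_sdk_docker_container",
)
```

Both values share the same `FunctionSpec`; only the optional field differs.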

            People

              Assignee: Luke Cwik (lcwik)
              Reporter: Kenneth Knowles (kenn)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m