Beam / BEAM-8732

Add support for mapping additional structured types to Python Schemas


    • Type: New Feature
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Component: sdk-py-core


      Currently we can convert between a NamedTuple type and its Schema protos using named_tuple_from_schema and named_tuple_to_schema. I'd like to introduce a system to support additional types, starting with structured types like attrs, dataclasses, and TypedDict.

      I've only just started digesting the code, but this task seems pretty straightforward. For example, I think the type-to-schema code would look roughly like this:

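      from uuid import uuid4

      def typing_to_runner_api(type_):
        # type: (Type) -> schema_pb2.FieldType
        # _get_structured_handler is the proposed lookup; it returns None for
        # types that have no structured handler.
        structured_handler = _get_structured_handler(type_)
        if structured_handler:
          schema = None
          # Reuse a previously registered schema when the type carries an id.
          if hasattr(type_, 'id'):
            schema = SCHEMA_REGISTRY.get_schema_by_id(type_.id)
          if schema is None:
            fields = structured_handler.get_fields()
            type_id = str(uuid4())
            schema = schema_pb2.Schema(fields=fields, id=type_id)
            SCHEMA_REGISTRY.add(type_, schema)
          # The original snippet is cut off here; wrapping the schema in a
          # RowType is an assumption based on how NamedTuple is handled.
          return schema_pb2.FieldType(
              row_type=schema_pb2.RowType(schema=schema))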

      The rest of the work would be implementing a class hierarchy for working with structured types: getting a list of fields from an instance, and instantiating an object from a list of fields. Eventually we can extend this behavior to arbitrary, unstructured types.

      Going in the schema-to-type direction, we have the problem of choosing which type to use for a given schema. I believe that as long as typing_to_runner_api() has been called on our structured type in the current Python session, the type will have been added to the registry and will thus round-trip OK, so I think we just need a public function for registering schemas for structured types.
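      A minimal sketch of that registration idea, with the registry reduced to a mapping between schema ids and Python types. The names SchemaTypeRegistry, register_structured_type, and type_from_schema_id are hypothetical and are not part of Beam; the class only stands in for SCHEMA_REGISTRY so the example runs on its own.

```python
import dataclasses
import uuid
from typing import Dict, Optional


class SchemaTypeRegistry:
  """Hypothetical stand-in for SCHEMA_REGISTRY: maps schema ids to types."""

  def __init__(self):
    self._type_by_id: Dict[str, type] = {}
    self._id_by_type: Dict[type, str] = {}

  def register(self, type_: type, schema_id: Optional[str] = None) -> str:
    # Registering the same type twice keeps the first id, so repeated
    # calls in one session stay consistent.
    existing = self._id_by_type.get(type_)
    if existing is not None:
      return existing
    schema_id = schema_id or str(uuid.uuid4())
    self._type_by_id[schema_id] = type_
    self._id_by_type[type_] = schema_id
    return schema_id

  def get_type_by_id(self, schema_id: str) -> Optional[type]:
    return self._type_by_id.get(schema_id)


_REGISTRY = SchemaTypeRegistry()


def register_structured_type(type_: type, schema_id: Optional[str] = None) -> str:
  """Public entry point: make type_ the target for its schema id."""
  return _REGISTRY.register(type_, schema_id)


def type_from_schema_id(schema_id: str) -> Optional[type]:
  """Schema-to-type lookup; None means no type was registered this session."""
  return _REGISTRY.get_type_by_id(schema_id)


@dataclasses.dataclass
class Movie:
  title: str
  year: int


movie_schema_id = register_structured_type(Movie)
resolved = type_from_schema_id(movie_schema_id)
```

      When the lookup returns None, the schema-to-type code could fall back to generating a NamedTuple for the schema.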

      bhulette, did you want to tackle this, or are you OK with me going after it?






              Assignee: Unassigned
              Reporter: Chad Dombrova (chadrik)

