Beam / BEAM-6607

SchemaCoder cannot encode row with null value in array

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.11.0
    • Component/s: sdk-java-core

    Description

      This example fails with a "cannot encode null Integer" exception:

      import com.google.common.io.ByteStreams;
      import org.apache.beam.sdk.schemas.Schema;
      import org.apache.beam.sdk.schemas.SchemaCoder;
      import org.apache.beam.sdk.values.Row;
      import java.io.IOException;
      import java.util.Collections;
      
      public class Main {
          public static void main(String[] args) throws IOException {
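              // Array field "a" whose INT32 elements are declared nullable (second argument).
              Schema schema = Schema.builder()
                      .addField("a", Schema.FieldType.array(Schema.FieldType.INT32, true))
                      .build();
      
              // A row whose array holds a single null element.
              Row row = Row.withSchema(schema).addValue(Collections.singletonList(null)).build();
      
              // Encoding fails here on the null element.
              SchemaCoder.of(schema).encode(row, ByteStreams.nullOutputStream());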
          }
      }

      Note that null in the array should be OK, as the nullable parameter to Schema.FieldType.array is true.

      An easy way of solving this could be to wrap the inner coders with a NullableCoder, but a better way would probably be to have something like a NullableIterableCoder that uses a bitset, similar to how the SchemaCoder encodes nullable fields.
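
The bitset idea can be illustrated outside Beam with a minimal, stdlib-only sketch. The class name NullableIntListCodec and the byte layout are hypothetical, chosen only for illustration; this is not Beam's wire format, not an existing Beam coder, and not necessarily how the issue was eventually fixed. It records the null positions once in a bitmap and then writes only the non-null values.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-alone codec (not part of Beam) showing a null bitmap
// for list elements.
// Assumed layout: [int size][int bitmapLength][bitmap bytes][int per non-null element]
class NullableIntListCodec {

    static byte[] encode(List<Integer> values) {
        // Mark every null position in the bitmap and count the real values.
        BitSet nulls = new BitSet(values.size());
        int nonNull = 0;
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) == null) {
                nulls.set(i);
            } else {
                nonNull++;
            }
        }
        byte[] bitmap = nulls.toByteArray();

        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + bitmap.length + 4 * nonNull);
        buf.putInt(values.size());
        buf.putInt(bitmap.length);
        buf.put(bitmap);
        // Only non-null elements take up space in the payload.
        for (Integer v : values) {
            if (v != null) {
                buf.putInt(v);
            }
        }
        return buf.array();
    }

    static List<Integer> decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int size = buf.getInt();
        byte[] bitmap = new byte[buf.getInt()];
        buf.get(bitmap);
        BitSet nulls = BitSet.valueOf(bitmap);

        List<Integer> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            // A set bit means "null here"; otherwise read the next stored value.
            if (nulls.get(i)) {
                out.add(null);
            } else {
                out.add(buf.getInt());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, null, 3, null);
        byte[] encoded = encode(input);
        System.out.println("round trip ok: " + decode(encoded).equals(input));
        System.out.println("encoded bytes: " + encoded.length);
    }
}
```

Compared with a per-element presence marker, which is roughly what wrapping each element coder in a nullable wrapper amounts to, the bitmap costs about one bit per element instead of one byte.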

      I'll probably take a stab at fixing this and making a pull request.

            People

              Assignee: Mike Pedersen (mpedersen)
              Reporter: Mike Pedersen (mpedersen)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h