BEAM-6607

SchemaCoder cannot encode row with null value in array

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.11.0
    • Component/s: sdk-java-core
    • Labels:
      None

      Description

      This example fails with a "cannot encode null Integer" exception:

      import com.google.common.io.ByteStreams;
      import org.apache.beam.sdk.schemas.Schema;
      import org.apache.beam.sdk.schemas.SchemaCoder;
      import org.apache.beam.sdk.values.Row;
      import java.io.IOException;
      import java.util.Collections;
      
      public class Main {
          public static void main(String[] args) throws IOException {
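              // Array field whose element type is nullable (second argument is true).
              Schema schema = Schema.builder()
                      .addField("a", Schema.FieldType.array(Schema.FieldType.INT32, true))
                      .build();
      
              // A row whose array holds a single null element.
              Row row = Row.withSchema(schema).addValue(Collections.singletonList(null)).build();
      
              // Encoding throws here, although the schema allows null elements.
              SchemaCoder.of(schema).encode(row, ByteStreams.nullOutputStream());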
          }
      }

      Note that null in the array should be OK, as the nullable parameter to Schema.FieldType.array is true.

      A simple fix would be to wrap the inner coders in a NullableCoder, but a better approach would probably be something like a NullableIterableCoder that uses a bitset, similar to how the SchemaCoder encodes nullable fields.
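
To make the trade-off between the two ideas concrete, here is a minimal, standalone sketch in plain Java. It does not use any Beam classes and is not Beam's actual wire format: the class name `NullableListEncodingSketch`, its method names, and the byte layout are all hypothetical and chosen only for illustration. The first method writes one presence byte per element, which is roughly what wrapping each element in a nullable coder amounts to; the second writes a single bitset of null positions followed by the non-null values only.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration only; not Beam's coder or wire format. */
public class NullableListEncodingSketch {

    /** Approach 1: a presence byte before every element (0 = null, 1 = value follows). */
    public static byte[] encodeWithMarkers(List<Integer> values) {
        ByteBuffer buf = ByteBuffer.allocate(4 + values.size() * 5);
        buf.putInt(values.size());
        for (Integer v : values) {
            if (v == null) {
                buf.put((byte) 0);
            } else {
                buf.put((byte) 1);
                buf.putInt(v);
            }
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    /** Approach 2: element count, then a bitset of null positions, then non-null values. */
    public static byte[] encodeWithBitSet(List<Integer> values) {
        BitSet nulls = new BitSet(values.size());
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) == null) {
                nulls.set(i);
            }
        }
        // BitSet.toByteArray() drops trailing zero bytes, so pad to a fixed length.
        int maskLength = (values.size() + 7) / 8;
        byte[] mask = Arrays.copyOf(nulls.toByteArray(), maskLength);

        ByteBuffer buf = ByteBuffer.allocate(4 + maskLength + values.size() * 4);
        buf.putInt(values.size());
        buf.put(mask);
        for (Integer v : values) {
            if (v != null) {
                buf.putInt(v);
            }
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    /** Inverse of encodeWithBitSet. */
    public static List<Integer> decodeWithBitSet(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int size = buf.getInt();
        byte[] mask = new byte[(size + 7) / 8];
        buf.get(mask);
        BitSet nulls = BitSet.valueOf(mask);

        List<Integer> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            if (nulls.get(i)) {
                out.add(null);
            } else {
                out.add(buf.getInt());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, null, 3);
        byte[] encoded = encodeWithBitSet(values);
        System.out.println(decodeWithBitSet(encoded).equals(values)); // prints "true"
        // Presence bytes cost one byte per element; the bitset costs one bit per element.
        System.out.println(encodeWithMarkers(values).length + " vs " + encoded.length); // prints "15 vs 13"
    }
}
```

In this sketch the per-element overhead drops from one byte to one bit, which is the motivation for preferring the bitset variant on large arrays. A real coder would also have to delegate element encoding to the inner coder rather than hard-coding 4-byte integers.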

      I'll probably take a stab at fixing this and making a pull request.

              People

              • Assignee:
                mpedersen Mike Pedersen
                Reporter:
                mpedersen Mike Pedersen
              • Votes:
                0
                Watchers:
                2

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  0.5h