  1. Beam
  2. BEAM-11861

ParquetIO throws "Coder not found" when using parseGenericRecords or parseFilesGenericRecords

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.28.0
    • Fix Version/s: 2.29.0
    • Component/s: io-java-parquet
    • Labels: None

    Description

      ParquetIO does not set an output coder when a user-specified parsing function is passed to `parseGenericRecords` or `parseFilesGenericRecords`, the feature for reading Parquet files with an unknown schema. As a result, the pipeline fails with a "Coder not found" error.

      Workaround:
      Call `setCoder` directly on the output `PCollection`, for example:

      SerializableFunction<GenericRecord, Foo> parseFn = ...;
      Coder<Foo> fooCoder = ...;
      PCollection<Foo> records =
          p.apply(ParquetIO.parseGenericRecords(parseFn).from(...))
              .setCoder(fooCoder);
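
      A fuller sketch of the same workaround follows, in case it helps anyone hitting this on 2.28.0. It is an illustration only, not the code from the fix. The `Foo` class, the `name` field, the `PARSE_FN` constant, the `readFoos` helper, and the use of `args[0]` as the file pattern are all hypothetical. `SerializableCoder` is just one possible coder choice, and running `main` assumes a runner such as the direct runner is on the classpath.

```java
import java.io.Serializable;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.Coder;
import org.apache.beam.sdk.coders.SerializableCoder;
import org.apache.beam.sdk.io.parquet.ParquetIO;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.values.PCollection;

public class ParquetCoderWorkaround {

  // Hypothetical output type. It implements Serializable so that
  // SerializableCoder can encode it.
  public static class Foo implements Serializable {
    public final String name;

    public Foo(String name) {
      this.name = name;
    }
  }

  // Hypothetical parsing function: copies an assumed "name" field out of each
  // record. String.valueOf also handles Avro Utf8 values (and turns null into "null").
  static final SerializableFunction<GenericRecord, Foo> PARSE_FN =
      record -> new Foo(String.valueOf(record.get("name")));

  // Reads Parquet files matching filePattern and sets the coder explicitly on
  // the output PCollection, which is the workaround described in this issue.
  static PCollection<Foo> readFoos(Pipeline p, String filePattern) {
    Coder<Foo> fooCoder = SerializableCoder.of(Foo.class);
    return p.apply(ParquetIO.parseGenericRecords(PARSE_FN).from(filePattern))
        .setCoder(fooCoder);
  }

  public static void main(String[] args) {
    // args[0] is assumed to be a file pattern such as a path to *.parquet files.
    Pipeline p = Pipeline.create();
    readFoos(p, args[0]);
    p.run().waitUntilFinish();
  }
}
```

      Any other `Coder<Foo>` should work in place of `SerializableCoder` here; the relevant part is only that `setCoder` is called on the `PCollection` returned by the read.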

            People

              Assignee: Anant Damle (anantdamle)
              Reporter: Anant Damle (anantdamle)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m