  1. Apache NiFi
  2. NIFI-12745

AvroReader silently drops record if it's malformed

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-M1, 1.18.0, 1.19.0, 1.20.0, 1.19.1, 1.21.0, 1.22.0, 1.23.0, 1.24.0, 1.23.1, 1.23.2, 1.25.0, 2.0.0-M2
    • Fix Version/s: 2.0.0-M3, 1.26.0
    • Component/s: Core Framework
    • Labels: None

    Description

      See the attached example flow. It reproduces the issue very reliably.

      GenerateFlowFile is set to generate the following JSON:

      [{
        "field_1" : 123456789,
        "field_2" : "44",
        "field_3" : 5
      }] 
      

      This input is converted to Avro format using the ConvertRecord processor. The 'Schema Write Strategy' of the AvroRecordSetWriter is set to anything other than 'Embed Avro Schema'.

      The resulting FlowFile is then routed to a processor that uses an AvroReader to work on the records. The reader is set to use a predefined, fixed schema that does not match the input Avro file: it contains at least one extra field. It does not matter whether that field has a default value.

      {
        "type":"record",
        "name":"message_name",
        "namespace":"message_namespace",
        "fields":[
          {
            "name":"field_1",
            "type":["long"]
          },
          {
            "name":"field_2",
            "type":["string"]
          },
          {
            "name":"field_3",
            "type":["int"]
          },
          {
            "name":"extra_field",
            "type":["string"],
            "default":"empty"
          }
        ]
      }
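
To illustrate why this mismatch cannot be decoded, here is a minimal pure-Python sketch of Avro-style binary decoding. It is not NiFi or Avro library code. It assumes plain field types (the single-branch unions in the schema above are ignored for brevity), and all function names are hypothetical. Because the schema is not embedded, the reader has only its own field list to go by, so it keeps reading after field_3 and runs out of bytes. A default value cannot help here, since defaults are applied when resolving a reader schema against a known writer schema.

```python
import io

# Field lists stand in for Avro record schemas (simplified: plain types, no unions).
WRITER_FIELDS = [("field_1", "long"), ("field_2", "string"), ("field_3", "int")]
READER_FIELDS = WRITER_FIELDS + [("extra_field", "string")]


def encode_long(n):
    """Avro int/long encoding: zigzag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf):
    result, shift = 0, 0
    while True:
        byte = buf.read(1)
        if not byte:
            raise EOFError("input ended while reading an int/long")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return (result >> 1) ^ -(result & 1)  # undo zigzag
        shift += 7


def encode_record(record, fields):
    out = bytearray()
    for name, kind in fields:
        if kind == "string":
            data = record[name].encode("utf-8")
            out += encode_long(len(data)) + data  # length prefix, then bytes
        else:
            out += encode_long(record[name])
    return bytes(out)


def decode_record(payload, fields):
    buf = io.BytesIO(payload)
    record = {}
    for name, kind in fields:
        if kind == "string":
            length = decode_long(buf)
            data = buf.read(length)
            if len(data) < length:
                raise EOFError("input ended while reading a string")
            record[name] = data.decode("utf-8")
        else:
            record[name] = decode_long(buf)
    return record


record = {"field_1": 123456789, "field_2": "44", "field_3": 5}
payload = encode_record(record, WRITER_FIELDS)  # no schema travels with the bytes

# Decoding with the schema that wrote the data round-trips.
assert decode_record(payload, WRITER_FIELDS) == record

# Decoding with the reader schema that has extra_field hits end of input mid-record.
try:
    decode_record(payload, READER_FIELDS)
except EOFError as exc:
    print("malformed record:", exc)
```

The point of the sketch is that end of input in the middle of a record is distinguishable from a clean end of stream, so a reader can report it instead of returning "no more records".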
      

      When this processor consumes the input, the reader silently drops the record without even logging an error. At the processor level, this is equivalent to having no records to process, so nothing happens. The user won't realize that something is misconfigured until they notice that flow files are missing.

      The expected behavior is for the processor to route the malformed input FlowFile to its failure relationship and report an error on its bulletin.
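
The expected behavior can be sketched roughly as follows. This is a hypothetical Python illustration, not the NiFi API or the actual fix: the names read_records, route, and make_decoder are invented, and the toy wire format (a one-byte length followed by that many bytes per field) is an assumption chosen to keep the example short. End of input at a record boundary ends the stream normally, while end of input inside a record is raised as an error and routed to failure.

```python
import io


class MalformedRecordError(Exception):
    """Raised when the input ends in the middle of a record."""


def make_decoder(field_count):
    """Build a decoder for a toy format: each field is a 1-byte length plus data."""
    def decode_one(buf):
        fields = []
        for _ in range(field_count):
            header = buf.read(1)
            if not header:
                raise EOFError("input ended before a field header")
            data = buf.read(header[0])
            if len(data) < header[0]:
                raise EOFError("input ended inside a field")
            fields.append(data.decode("utf-8"))
        return fields
    return decode_one


def read_records(payload, decode_one):
    """Decode records until the payload is exhausted."""
    buf = io.BytesIO(payload)
    records = []
    while buf.tell() < len(payload):  # a clean end of stream stops here
        start = buf.tell()
        try:
            records.append(decode_one(buf))
        except EOFError as exc:
            # End of input mid-record is an error, not "no more records".
            raise MalformedRecordError(
                f"record at byte offset {start} is truncated "
                "or does not match the reader schema"
            ) from exc
    return records


def route(payload, decode_one):
    """Return (relationship, result), mimicking success/failure routing."""
    try:
        return "success", read_records(payload, decode_one)
    except MalformedRecordError as exc:
        return "failure", str(exc)


payload = b"\x01a\x02bc\x01d"  # one record with three fields

print(route(payload, make_decoder(3)))  # reader expects three fields
print(route(payload, make_decoder(4)))  # reader expects an extra field
```

With the three-field decoder the record is returned on the success path. With the four-field decoder the caller gets a failure and an error message to surface, rather than an empty result.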

      Attachments

        1. ValidateRecord.json
          47 kB
          Rajmund Takacs

            People

              Assignee: Rajmund Takacs (takraj)
              Reporter: Rajmund Takacs (takraj)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m