Apache NiFi / NIFI-6986

ValidateRecord should optionally validate if nullable fields are present

Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels: None

    Description

      Currently, if a field is nullable according to the schema, ValidateRecord considers the record to be valid, even if the field is missing completely. For some use cases, this is desirable. For example, it is common to drop fields in JSON when the field's value is null, because it can drastically reduce the size of the JSON.

      However, in other use cases, this is not desirable. For example, in a CSV file, we may want to require that every Record contain the expected number of fields. It may be acceptable, for instance, to have a line like "1234, John Smith, , , ," but not a line like "1234, John Smith".

      ValidateRecord should be updated with a new Property: "Allow Missing Null Values". If the value is `true` (the default, to avoid changing behavior between versions), the Processor should behave as it does now, where the absence of the field is synonymous with a null value. In this case, a line like "1234, John Smith" would be valid when the CSV is expecting 6 fields, as long as the last 4 fields are nullable.

      But if the value of this new property is `false`, the Processor should require that all fields be present in the data, even if the field has a null value. In this case, a line like "1234, John Smith" would be invalid if the CSV were expected to contain 6 fields.
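
The two settings can be illustrated with the CSV lines from the description. This is a minimal sketch, not NiFi code: the class `MissingFieldSketch` and the method `hasEnoughFields` are hypothetical, and it assumes the non-nullable fields come first in the line and that a plain comma split is enough to count fields.

```java
// Hypothetical illustration only; this is not part of NiFi.
final class MissingFieldSketch {

    /**
     * Returns true if the line carries enough fields for the given setting.
     *
     * Assumption: the first nonNullableFields columns are the non-nullable
     * ones, and every later column is nullable.
     */
    static boolean hasEnoughFields(String line, int expectedFields,
                                   int nonNullableFields, boolean allowMissingNullValues) {
        // A limit of -1 keeps trailing empty strings, so "a, b, ," yields 4 values.
        int present = line.split(",", -1).length;

        // Lenient mode: an absent nullable field counts as null.
        // Strict mode: every field in the schema must be present.
        int minimum = allowMissingNullValues ? nonNullableFields : expectedFields;
        return present >= minimum;
    }
}
```

With six expected fields, two of them non-nullable, `"1234, John Smith"` passes only when `allowMissingNullValues` is `true`, while `"1234, John Smith, , , ,"` passes under both settings. Extra fields and quoted commas are deliberately out of scope here.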

      The `WriteJsonResult` class has a method in it: `private boolean isFieldPresent(RecordField field, Record record)`. This method belongs on `Record` itself, with a slightly different signature: `boolean isFieldPresent(RecordField field)`. `Record` should provide a default implementation, akin to the one in `WriteJsonResult`, and `WriteJsonResult` should then simply call that method.
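
The proposed move could look roughly like the sketch below. It uses simplified stand-in types (`SketchField` and `SketchRecord` are hypothetical, not the real `RecordField` and `Record`), and it assumes the existing check amounts to looking up the field's name and aliases among the record's raw field names; the actual `WriteJsonResult` logic may differ.

```java
import java.util.Set;

// Simplified, hypothetical stand-in for a schema field: a name plus aliases.
final class SketchField {
    private final String name;
    private final Set<String> aliases;

    SketchField(String name, Set<String> aliases) {
        this.name = name;
        this.aliases = aliases;
    }

    String getFieldName() {
        return name;
    }

    Set<String> getAliases() {
        return aliases;
    }
}

// Simplified, hypothetical stand-in for a record.
interface SketchRecord {

    // Names of the fields that were actually present in the source data.
    Set<String> getRawFieldNames();

    // Default implementation, so existing implementations need no changes.
    default boolean isFieldPresent(SketchField field) {
        Set<String> rawFieldNames = getRawFieldNames();
        if (rawFieldNames.contains(field.getFieldName())) {
            return true;
        }
        // A field also counts as present if it appears under one of its aliases.
        for (String alias : field.getAliases()) {
            if (rawFieldNames.contains(alias)) {
                return true;
            }
        }
        return false;
    }
}
```

Placing the method on the interface as a `default` lets both the JSON writer and the validator share one definition of "present" without breaking other implementations.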

      `StandardSchemaValidator` should then be updated to use this method to validate that records contain all required fields, as configured. `SchemaValidationContext` should also be updated to indicate whether the presence of null-valued fields should be validated.
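
One possible shape for the validator change is sketched below. All names (`ValidationContextSketch`, `PresenceValidatorSketch`, `findMissingFields`) are hypothetical stand-ins rather than the real NiFi classes, and the sketch covers only the proposed presence check; the existing type and nullability checks are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the validation context carrying the new flag.
final class ValidationContextSketch {
    private final boolean allowMissingNullValues;

    ValidationContextSketch(boolean allowMissingNullValues) {
        this.allowMissingNullValues = allowMissingNullValues;
    }

    boolean isAllowMissingNullValues() {
        return allowMissingNullValues;
    }
}

// Hypothetical stand-in for the presence check inside the schema validator.
final class PresenceValidatorSketch {
    private final ValidationContextSketch context;

    PresenceValidatorSketch(ValidationContextSketch context) {
        this.context = context;
    }

    // Returns the schema fields that are absent from the raw record values.
    List<String> findMissingFields(List<String> schemaFieldNames, Map<String, Object> rawValues) {
        List<String> missing = new ArrayList<>();
        if (context.isAllowMissingNullValues()) {
            // Lenient mode: absence is treated the same as an explicit null.
            return missing;
        }
        for (String fieldName : schemaFieldNames) {
            // containsKey is true for an explicit null, so null values still count as present.
            if (!rawValues.containsKey(fieldName)) {
                missing.add(fieldName);
            }
        }
        return missing;
    }
}
```

In strict mode, a record holding `email` with an explicit null passes the presence check, while a record with no `email` key at all is reported as missing that field.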

            People

              Assignee: Shawn Weeks (Absolutesantaja)
              Reporter: Mark Payne (markap14)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m