Spark / SPARK-45671

Implement an option similar to corrupt record column in State Data Source Reader

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None

    Description

      Queries against the state will most likely fail if the underlying state file is corrupted. Another failure case is when the raw binary data that the state store reads from the state file does not match the state schema, which ends in an exception or fatal error at runtime.

      (We cannot catch the case where data is loaded with an incorrect schema but no exception is thrown, and we cannot attach the schema to every record.)

      To handle the above cases without failing the query, we want to return state rows for the valid rows and, if users specify an option, also return the raw binary data for the corrupted rows (as we do for CSV/JSON).
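
      The intended behavior can be sketched outside Spark as follows. This is only an illustration of the idea, not the State Data Source API: the fixed two-field schema, the names `StateRow`, `read_state_rows`, and `corrupt_record`, and the `permissive` flag are all hypothetical and chosen for the example. Rows that decode cleanly are returned as typed fields; rows that fail to decode are returned with null fields plus their raw bytes instead of aborting the read.

```python
import struct
from typing import Iterable, Iterator, NamedTuple, Optional

# Hypothetical state schema for illustration only:
# a 4-byte big-endian int key followed by an 8-byte big-endian int value.
ROW_FORMAT = ">iq"


class StateRow(NamedTuple):
    key: Optional[int]
    value: Optional[int]
    # Raw bytes of a row that could not be decoded; None for valid rows.
    corrupt_record: Optional[bytes]


def read_state_rows(
    raw_rows: Iterable[bytes], permissive: bool = True
) -> Iterator[StateRow]:
    """Decode raw rows, optionally surfacing undecodable ones as binary."""
    for raw in raw_rows:
        try:
            key, value = struct.unpack(ROW_FORMAT, raw)
            yield StateRow(key, value, None)
        except struct.error:
            if not permissive:
                # Without the option, the read fails on the first bad row.
                raise
            # With the option, keep going and hand back the raw bytes.
            yield StateRow(None, None, raw)


good = struct.pack(ROW_FORMAT, 1, 100)
bad = b"\x00\x01\x02"  # too short for the schema, so decoding fails

rows = list(read_state_rows([good, bad]))
for row in rows:
    print(row)
```

      As noted above, this only catches rows whose decoding raises an error. In the sketch, any 12-byte garbage input would still decode into a (wrong) key and value without being flagged.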

            People

              Assignee: Unassigned
              Reporter: Jungtaek Lim (kabhwan)
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated: