Apache Arrow / ARROW-7703

[C++][Dataset] Give more informative error message for mismatching schemas for FileSystemSources

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++, Python

    Description

      Currently, if you try to create a dataset from files with different schemas, you get this error:

      ArrowInvalid: Unable to merge: Field a has incompatible types: int64 vs int32
      

      If you are reading a directory of files, it would be very helpful if the error message could indicate which files are involved (e.g. if you have a lot of files and only one has a mismatching schema).

      You can already inspect the schemas if you first create a SourceFactory manually, but that only gives a list of schemas, not mapped to their original files (this last item probably relates to ARROW-7608).
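
      As a rough illustration of the kind of message that would help, here is a minimal pure-Python sketch that keeps each schema mapped to its file and reports which files disagree on a field's type. It is not Arrow's implementation: `find_type_conflicts`, `format_conflicts`, the file paths and the dict-based schemas are all hypothetical stand-ins. For Parquet files, the per-file schemas could presumably be collected with something like `pyarrow.parquet.read_schema(path)`.

```python
from collections import defaultdict


def find_type_conflicts(file_schemas):
    """Return {field: {type name: [files]}} for fields whose type differs across files.

    `file_schemas` is a hypothetical mapping of file path -> {field name: type name}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for path, schema in file_schemas.items():
        for field, type_name in schema.items():
            seen[field][type_name].append(path)
    # Keep only the fields that were seen with more than one type.
    return {
        field: dict(by_type)
        for field, by_type in seen.items()
        if len(by_type) > 1
    }


def format_conflicts(conflicts):
    """Render one line per conflicting field, naming the files behind each type."""
    lines = []
    for field, by_type in sorted(conflicts.items()):
        parts = [
            f"{type_name} in {', '.join(sorted(paths))}"
            for type_name, paths in sorted(by_type.items())
        ]
        lines.append(f"Field {field} has incompatible types: " + " vs ".join(parts))
    return "\n".join(lines)


# Illustrative input only: three files, one of which stores `a` as int32.
file_schemas = {
    "data/part-0.parquet": {"a": "int64", "b": "string"},
    "data/part-1.parquet": {"a": "int64", "b": "string"},
    "data/part-2.parquet": {"a": "int32", "b": "string"},
}
conflicts = find_type_conflicts(file_schemas)
message = format_conflicts(conflicts)
print(message)
```

      Grouping files by field and type means the one outlier file is easy to spot even when the directory contains many files.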

          People

            Assignee: Unassigned
            Reporter: Joris Van den Bossche (jorisvandenbossche)