Apache Arrow / ARROW-6719

Parquet read_table error in Python 3.7: pyarrow.lib.ArrowInvalid: Column data for field with type list<...> is inconsistent with schema list<...>


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 0.15.0
    • None
    • None
    • Python 3.7

    Description

      I have Parquet files with complex columns of type List<item: double>, List<item: string>, etc., and I am using the latest PyArrow release (0.15.0) to process them.

      In Python 3.7, the same pyarrow.parquet.read_table(...) calls raise errors of the following kind:

      "pyarrow.lib.ArrowInvalid: Column data for field 0 with type list<item: double> is inconsistent with schema list<element: double>"

      This issue might be related to https://issues.apache.org/jira/browse/ARROW-6068
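
      To make the mismatch in the quoted error easier to see, here is a minimal sketch that pulls the two type strings out of the message and compares them. It uses only the Python standard library and does not call PyArrow. The helper name compare_list_types and both regular expressions are hypothetical, written for this one message format; they are not part of any PyArrow API.

```python
import re

# Error text copied verbatim from the report above.
MESSAGE = (
    "pyarrow.lib.ArrowInvalid: Column data for field 0 with type "
    "list<item: double> is inconsistent with schema list<element: double>"
)

# Assumption: the message always has the shape
# "... with type <A> is inconsistent with schema <B>".
PATTERN = re.compile(
    r"with type (?P<data>.+?) is inconsistent with schema (?P<schema>.+)$"
)

# Assumption: only flat "list<name: type>" strings are handled here.
LIST_TYPE = re.compile(r"^list<(?P<name>\w+): (?P<value>.+)>$")


def compare_list_types(message):
    """Hypothetical helper: report how the two list types in the message differ."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message does not match the expected format")
    data = LIST_TYPE.match(match["data"])
    schema = LIST_TYPE.match(match["schema"])
    if data is None or schema is None:
        raise ValueError("only simple list<name: type> strings are handled")
    return {
        "data_child_name": data["name"],
        "schema_child_name": schema["name"],
        "same_value_type": data["value"] == schema["value"],
    }


result = compare_list_types(MESSAGE)
print(result)
```

      For the message in this report, the value type is double on both sides; only the name of the list's child field differs ("item" in the column data, "element" in the schema).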

      Attachments

        1. read-fail.snappy.parquet
          96 kB
          V Luong

            People

              Assignee: Unassigned
              Reporter: V Luong (MBALearnsToCode)
              Votes: 0
              Watchers: 3
