Apache Arrow / ARROW-7663

[Python] from_pandas gives TypeError instead of ArrowTypeError in some cases

Details

    Description

      For mixed-type object columns, from_pandas sometimes raises a plain TypeError with an uninformative message instead of an ArrowTypeError that names the failing column and its type. Which exception is raised depends on the order of the values:

      >>> pa.Table.from_pandas(pd.DataFrame({"a": ['a', 1]}))
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "pyarrow/table.pxi", line 1177, in pyarrow.lib.Table.from_pandas
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in dataframe_to_arrays
          for c, f in zip(columns_to_convert, convert_fields)]
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in <listcomp>
          for c, f in zip(columns_to_convert, convert_fields)]
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 566, in convert_column
          raise e
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 560, in convert_column
          result = pa.array(col, type=type_, from_pandas=True, safe=safe)
        File "pyarrow/array.pxi", line 265, in pyarrow.lib.array
        File "pyarrow/array.pxi", line 80, in pyarrow.lib._ndarray_to_array
        File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
      pyarrow.lib.ArrowTypeError: ("Expected a bytes object, got a 'int' object", 'Conversion failed for column a with type object')
      >>> pa.Table.from_pandas(pd.DataFrame({"a": [1, 'a']}))
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "pyarrow/table.pxi", line 1177, in pyarrow.lib.Table.from_pandas
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in dataframe_to_arrays
          for c, f in zip(columns_to_convert, convert_fields)]
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in <listcomp>
          for c, f in zip(columns_to_convert, convert_fields)]
        File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 560, in convert_column
          result = pa.array(col, type=type_, from_pandas=True, safe=safe)
        File "pyarrow/array.pxi", line 265, in pyarrow.lib.array
        File "pyarrow/array.pxi", line 80, in pyarrow.lib._ndarray_to_array
      TypeError: an integer is required (got type str)
      

      Noticed on 0.15.1 and on master when we tried to upgrade. On 0.14.1, both cases gave ArrowTypeError.
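
Until both orderings produce the same error, a caller could check columns for mixed Python types before handing them to from_pandas. The following is only a minimal sketch of that idea, not part of pyarrow: the helper name mixed_type_columns is hypothetical, it works on a plain mapping of column names to values so that it needs neither pandas nor pyarrow, and it assumes None is the only missing-value marker.

```python
from typing import Any, Dict, Iterable, List, Mapping


def mixed_type_columns(columns: Mapping[str, Iterable[Any]]) -> Dict[str, List[str]]:
    """Return {column name: sorted type names} for columns with more than one Python type.

    Hypothetical helper for illustration. None is treated as a missing
    value and ignored (an assumption; other null markers are not handled).
    """
    report: Dict[str, List[str]] = {}
    for name, values in columns.items():
        # Collect the distinct Python type names seen in this column.
        type_names = {type(v).__name__ for v in values if v is not None}
        if len(type_names) > 1:
            report[name] = sorted(type_names)
    return report


# The two orderings from the report, plus a clean column for contrast.
data = {"a": ["a", 1], "b": [1, "a"], "c": [1, 2, None]}
report = mixed_type_columns(data)
print(report)  # {'a': ['int', 'str'], 'b': ['int', 'str']}
```

Because the check looks at the set of types rather than the first value, it reports both orderings identically. With a DataFrame, one option would be to pass something like {name: df[name].tolist() for name in df.columns}, though that copies the data and is only worth doing for object-dtype columns.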

            People

              arw2019 Andrew Wieteska
              lidavidm David Li

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 40m