  1. Apache Arrow
  2. ARROW-18430

[Python] Cannot cast nested nullable field to not-nullable

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.0.1
    • Fix Version/s: None
    • Component/s: Python
    • Labels: None

    Description

      Casting a nullable field to non-nullable works provided the column contains no null values. For example, this is a valid cast:

      import pyarrow as pa

      table = pa.table({'column_1': pa.array([1, 2, 3])})
      table.cast(
          pa.schema([
              f.with_nullable(False) for f in table.schema
          ])
      )

      But it doesn't work for a nested field. Here's an example:

      import pyarrow as pa
      
      record = {"nested_int": 1}
      
      data_type = pa.struct(
          [
              pa.field("nested_int", pa.int32(), nullable=True),
          ]
      )
      
      data_type_after = pa.struct(
          [
              pa.field("nested_int", pa.int32(), nullable=False),
          ]
      )
      
      table = pa.table({"column_1": pa.array([record], data_type)})
      
      table.cast(pa.schema([pa.field("column_1", data_type_after)])) 

      Throws:

      pyarrow.lib.ArrowTypeError: cannot cast nullable field to non-nullable field: struct<nested_int: int32> struct<nested_int: int32 not null> 

      This is somewhat related to https://github.com/apache/arrow/issues/13177 and https://issues.apache.org/jira/browse/ARROW-16603 

          People

            Assignee: Unassigned
            Reporter: 0x26dres &res