Apache Arrow / ARROW-6882

[Python] cannot create a chunked_array from dictionary_encoding result

Details

    Description

      Calling `pa.chunked_array` directly on the indices of a dictionary-encoded array raises an unexpected `ArrowInvalid` error (code below). Passing a view of the indices instead works around the problem.

      import pyarrow as pa

      # Dictionary-encode a string array, then inspect its integer indices
      ca = pa.array(['a', 'a', 'b', 'b', 'c'])
      fca = ca.dictionary_encode()
      fca.indices
      <pyarrow.lib.Int32Array object at 0x1250fb888>
      [
        0,
        0,
        1,
        1,
        2
      ]
      
      # Passing the indices directly to pa.chunked_array raises ArrowInvalid
      pa.chunked_array([fca.indices])
      ---------------------------------------------------------------------------
      ArrowInvalid                              Traceback (most recent call last)
      <ipython-input-44-71ca3b877e1c> in <module>
      ----> 1 pa.chunked_array([fca.indices])
      
      ~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
      
      ~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
      
      ArrowInvalid: Unexpected dictionary values in array of type int32
      
      # Wrapping the indices in a view of the same type works
      pa.chunked_array([fca.indices.view(fca.indices.type)])
      <pyarrow.lib.ChunkedArray object at 0x12508dc78>
      [
        [
          0,
          0,
          1,
          1,
          2
        ]
      ]
       

            People

              jorisvandenbossche Joris Van den Bossche
              ArtemK Artem KOZHEVNIKOV
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m