  1. Apache Arrow
  2. ARROW-12223

ArrayData buffers are inconsistent across implementations


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++, JavaScript, Rust
    • Labels: None

    Description

      The ArrayData implementations seem to share closely related structure fields across languages, but the way those fields are used is not consistent across implementations.

       

      Example using ListArray's offsets buffer in the C++, Rust, and JavaScript implementations:

       - C++: the offsets buffer is the second buffer (the validity bitmap is the first buffer, and buffers are laid out in a type-dependent way) https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/array_nested.cc#L189

       - Rust: the offsets buffer is the first buffer (the validity bitmap is not part of the buffer collection, and buffers are laid out in a type-dependent way) https://github.com/apache/arrow/blob/master/rust/arrow/src/array/array_list.rs#L235

       - JavaScript: the offsets buffer is the first buffer (buffers have fixed positions) https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/js/src/data.ts#L125

       

      Note that we have the same inconsistency for validity and data buffers.
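
      To make the difference concrete, here is a minimal Python sketch that models the three layouts described above and maps each one onto a single named, position-independent record. It does not use any Arrow library: ListBuffers and the from_*_layout helpers are hypothetical names invented for illustration, and the JavaScript slot indices (offsets = 0, data = 1, validity = 2) are an assumption based on the linked data.ts, since the description only states that the offsets buffer comes first and that positions are fixed.

```python
import struct
from typing import NamedTuple, Optional, Sequence


class ListBuffers(NamedTuple):
    """Hypothetical position-independent view of a ListArray's own buffers."""
    validity: Optional[bytes]
    offsets: bytes


def from_cpp_layout(buffers: Sequence[Optional[bytes]]) -> ListBuffers:
    # C++ layout as described above: validity bitmap first, offsets second.
    return ListBuffers(validity=buffers[0], offsets=buffers[1])


def from_rust_layout(buffers: Sequence[bytes],
                     null_bitmap: Optional[bytes]) -> ListBuffers:
    # Rust layout as described above: the validity bitmap lives outside
    # the buffer collection, so the offsets buffer is first.
    return ListBuffers(validity=null_bitmap, offsets=buffers[0])


# Assumed fixed slot indices for the JavaScript layout (not confirmed here).
JS_OFFSET, JS_DATA, JS_VALIDITY = 0, 1, 2


def from_js_layout(buffers: Sequence[Optional[bytes]]) -> ListBuffers:
    # JavaScript layout: every buffer kind has a fixed slot; unused slots are None.
    return ListBuffers(validity=buffers[JS_VALIDITY], offsets=buffers[JS_OFFSET])


# Buffers for the list array [[1, 2], [], [3]]: int32 offsets and a
# validity bitmap with the three low bits set (all entries valid).
offsets = struct.pack("<4i", 0, 2, 2, 3)
validity = bytes([0b00000111])

cpp = from_cpp_layout([validity, offsets])
rust = from_rust_layout([offsets], null_bitmap=validity)
js = from_js_layout([offsets, None, validity])

# The same logical buffers are recovered from all three positional layouts.
assert cpp == rust == js
```

      Naming each buffer instead of relying on its position is one possible way to keep such a record stable when it crosses implementations; it is a sketch of the idea, not something the Arrow libraries provide in this form.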

       

      This matters for my project because I would like to transport lists of buffers across technologies, and ArrayData seemed to be the easiest structure to transport.

          People

            Assignee: Unassigned
            Reporter: Vincent Trumpff (vincent-tr)