PARQUET-2068

[C++] [Parquet] Use arrow compute to determine min/max of dictionaries (possibly other arrays?)

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-cpp
    • Labels: None

    Description

      parquet::Comparator is currently used to calculate the min and max values of an array. It should be benchmarked against arrow::compute's MinMax kernel (once that kernel supports all the necessary data types). The kernel is expected to use SIMD more aggressively, which should result in better performance.

      Even if there is no performance difference, the MinMax kernel should be used when computing dictionary statistics, because the current implementation requires making a copy of the dictionary values array (see ARROW-12513).
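
To illustrate the copy-avoidance point, here is a minimal, library-free sketch of computing min/max over only the dictionary entries that the indices reference, reading the dictionary in place. It does not use Arrow or Parquet APIs. The helper name `DictionaryMinMax`, the use of `std::vector`, the `int64_t` value type, and the convention that a negative index stands for a null slot are all assumptions made for this example, not anything taken from the existing code.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Arrow or Parquet): returns {min, max} of
// the dictionary values that `indices` actually reference. The dictionary is
// read in place; the referenced values are never gathered into a temporary
// array. Assumption of this sketch: a negative index represents a null slot.
// Out-of-range indices are skipped. Returns std::nullopt when no valid index
// is present.
std::optional<std::pair<int64_t, int64_t>> DictionaryMinMax(
    const std::vector<int64_t>& dictionary,
    const std::vector<int32_t>& indices) {
  // Mark the referenced entries so each one is compared at most once.
  std::vector<bool> used(dictionary.size(), false);
  for (int32_t idx : indices) {
    if (idx >= 0 && static_cast<std::size_t>(idx) < dictionary.size()) {
      used[static_cast<std::size_t>(idx)] = true;
    }
  }

  std::optional<std::pair<int64_t, int64_t>> result;
  for (std::size_t i = 0; i < dictionary.size(); ++i) {
    if (!used[i]) continue;  // unreferenced entries must not affect statistics
    const int64_t v = dictionary[i];
    if (!result) {
      result = std::make_pair(v, v);
    } else {
      if (v < result->first) result->first = v;
      if (v > result->second) result->second = v;
    }
  }
  return result;
}
```

The only extra allocation is one bit per dictionary entry, rather than a second array of values. Whether the MinMax kernel can be applied in a comparable way to a dictionary array, and how it compares in speed with parquet::Comparator, is what this issue proposes to benchmark.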

          People

            Assignee: Unassigned
            Reporter: Weston Pace (westonpace)