HIVE-26572

Support constant expressions in vectorization

Details

    Description

      At the moment, we cannot vectorize aggregate expressions that take constant parameters in addition to the aggregation column (this is explicitly forbidden in the code).

      One compelling example of how this could help is PR 1824, linked to HIVE-24510, where compute_bit_vector had to be split into compute_bit_vector_hll and compute_bit_vector_fm when the HLL implementation was added, whereas a single compute_bit_vector($col, ['HLL'|'FM']) could have been used instead.
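      To illustrate the idea, here is a minimal sketch of how a constant second argument could select the implementation behind a single function. It is not Hive code: the class BitVectorSelectorSketch and its methods are hypothetical, and only the two function names come from the description above.

```java
import java.util.Locale;

// Hypothetical sketch, not part of Hive: shows a constant argument
// selecting between the two implementations named in this ticket.
final class BitVectorSelectorSketch {

    /** The two algorithms mentioned in the ticket. */
    enum Algorithm { HLL, FM }

    /** Parses the constant argument case-insensitively. */
    static Algorithm parseAlgorithm(String constantArg) {
        if (constantArg == null) {
            throw new IllegalArgumentException("algorithm constant must not be null");
        }
        try {
            return Algorithm.valueOf(constantArg.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("unsupported algorithm: " + constantArg, e);
        }
    }

    /** Maps the constant to the name of the existing split function. */
    static String resolveImplementation(String constantArg) {
        switch (parseAlgorithm(constantArg)) {
            case HLL:
                return "compute_bit_vector_hll";
            case FM:
                return "compute_bit_vector_fm";
            default:
                throw new AssertionError("unreachable");
        }
    }
}
```

      Because the argument is a constant, this resolution could in principle happen once at plan time rather than per row, which is what makes it a candidate for vectorization.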

      Another example is VectorUDAFBloomFilterMerge, which receives an extra constant parameter controlling the number of threads used for merging tasks. At the moment, this parameter is "injected" while searching for an appropriate constructor (see VectorGroupByOperator.java#L1224-L1244).
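      The kind of ad-hoc injection described above can be pictured roughly as follows. This is a simplified, self-contained sketch and not the actual VectorGroupByOperator code: the classes MergeAggregate and CountAggregate and the method instantiate are hypothetical stand-ins, with a String standing in for the aggregation descriptor.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch of ad-hoc constant injection during constructor lookup.
final class ConstantInjectionSketch {

    /** Stand-in for an aggregate needing an extra constant (thread count). */
    static final class MergeAggregate {
        final String inputColumn;
        final int numThreads;

        public MergeAggregate(String inputColumn, int numThreads) {
            this.inputColumn = inputColumn;
            this.numThreads = numThreads;
        }
    }

    /** Stand-in for an ordinary aggregate with no constant parameter. */
    static final class CountAggregate {
        final String inputColumn;

        public CountAggregate(String inputColumn) {
            this.inputColumn = inputColumn;
        }
    }

    /**
     * Tries the special-cased (column, int) constructor first and falls back
     * to the plain (column) constructor. Every new constant parameter would
     * need another hard-coded branch like this one.
     */
    static Object instantiate(Class<?> clazz, String inputColumn, int numThreads)
            throws ReflectiveOperationException {
        try {
            Constructor<?> withConstant =
                clazz.getDeclaredConstructor(String.class, int.class);
            return withConstant.newInstance(inputColumn, numThreads);
        } catch (NoSuchMethodException e) {
            Constructor<?> plain = clazz.getDeclaredConstructor(String.class);
            return plain.newInstance(inputColumn);
        }
    }
}
```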

      This ad-hoc approach is not scalable and would make the code hard to read and maintain if more UDAFs require constant parameters.

      In addition, we are probably missing vectorization opportunities wherever no such ad-hoc treatment has been added, even though an appropriate UDAF constructor is available or could easily be added (the data sketches UDAFs, although not yet vectorized, are a good target).
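      One possible direction, sketched here purely as an assumption and not as the design proposed for Hive, is to resolve the constructor generically from the aggregation column plus whatever constant arguments the expression carries. All names below (GenericConstantArgsSketch, SketchAggregate, PlainAggregate) are hypothetical, and boxed parameter types are used to keep the matching logic short.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch: generic constructor resolution from constant arguments.
final class GenericConstantArgsSketch {

    /** Stand-in for a UDAF taking one constant (for example a sketch size). */
    static final class SketchAggregate {
        final String column;
        final int k;

        public SketchAggregate(String column, Integer k) {
            this.column = column;
            this.k = k;
        }
    }

    /** Stand-in for a UDAF with no constant parameters. */
    static final class PlainAggregate {
        final String column;

        public PlainAggregate(String column) {
            this.column = column;
        }
    }

    /** Finds a constructor accepting (column, constants...) and invokes it. */
    static Object instantiate(Class<?> clazz, String column, Object... constants)
            throws ReflectiveOperationException {
        Object[] args = new Object[constants.length + 1];
        args[0] = column;
        System.arraycopy(constants, 0, args, 1, constants.length);
        for (Constructor<?> candidate : clazz.getDeclaredConstructors()) {
            if (accepts(candidate.getParameterTypes(), args)) {
                return candidate.newInstance(args);
            }
        }
        throw new NoSuchMethodException(
            "no constructor of " + clazz.getName() + " accepts " + args.length + " argument(s)");
    }

    /** True when every argument is a non-null instance of the parameter type. */
    private static boolean accepts(Class<?>[] types, Object[] args) {
        if (types.length != args.length) {
            return false;
        }
        for (int i = 0; i < types.length; i++) {
            if (args[i] == null || !types[i].isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }
}
```

      With a lookup of this shape, supporting a new constant parameter would only require adding a matching constructor to the UDAF, with no per-UDAF branch in the operator.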

            People

              asolimando Alessandro Solimando