HIVE-28196

Preserve column stats when applying UDF upper/lower.

Details

    Description

      Currently, Hive re-estimates column stats (including avgColLen) whenever it encounters a UDF.
      For upper and lower, Hive sets avgColLen to the value of hive.stats.max.variable.length.
      However, these UDFs do not change column stats, and the default value (100) is too high for the string-type key columns on which upper/lower are usually applied.

      This patch keeps the input data's avgColLen after applying the upper/lower UDFs, so that the optimizer can produce a better query plan.
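
      The difference between the two behaviors can be sketched as follows. This is only an illustration of the idea, not the patch itself: the class name, method name, and the set of length-preserving UDFs are hypothetical and do not come from the Hive code base. The only values taken from the description are the UDF names upper/lower and the default of 100 for hive.stats.max.variable.length.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names are taken from the Hive code base.
public class AvgColLenSketch {

    // UDFs assumed (per the description) to leave the average length unchanged.
    private static final Set<String> LENGTH_PRESERVING = Set.of("upper", "lower");

    /**
     * Picks the avgColLen to use for the output of a string UDF.
     *
     * @param udfName           name of the UDF being applied
     * @param inputAvgColLen    avgColLen already known for the input column
     * @param maxVariableLength value of hive.stats.max.variable.length (default 100)
     */
    public static double estimateAvgColLen(String udfName,
                                           double inputAvgColLen,
                                           int maxVariableLength) {
        if (LENGTH_PRESERVING.contains(udfName.toLowerCase(Locale.ROOT))) {
            // Proposed behavior: keep the input column's estimate.
            return inputAvgColLen;
        }
        // Fallback: nothing is known about the UDF's output length.
        return maxVariableLength;
    }

    public static void main(String[] args) {
        // A key column with an average length of 12 characters.
        System.out.println(estimateAvgColLen("upper", 12.0, 100));
        System.out.println(estimateAvgColLen("some_other_udf", 12.0, 100));
    }
}
```

      With the fallback alone, a 12-character key column wrapped in upper() would be estimated at 100 characters per value, which inflates the estimated data size the planner works with.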

            People

              seonggon Seonggon Namgung