IMPALA-4586

Values of non-deterministic UDFs are cached in backend

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend

    Description

      This increases the severity of a pre-existing problem: UDFs are always assumed to be deterministic, so UDFs with only constant arguments could be cached or constant folded. In most cases in Impala 2.7.0 this had no effect, e.g. both f and g in f() + g() were re-evaluated for each input row.

      The commit below added caching of constant arguments to ScalarFnCall expressions (used for UDFs, built-in functions and various operators), so f() and g() are no longer re-evaluated for each input row.

        commit 10fa472fa6aa036be02748ae54daed1722449c68
        Author: Tim Armstrong <tarmstrong@cloudera.com>
        Date:   Wed Oct 26 10:55:23 2016 -0700
      
            IMPALA-4302,IMPALA-2379: constant expr arg fixes
      

      The ideal solution is to provide syntax for UDF declarations that specifies whether the function is deterministic. As a short-term workaround we could add a query option that makes Impala treat all UDFs as non-deterministic.
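
      To make the failure mode concrete, here is a minimal Python sketch of the behaviour described above. It is not Impala code: the `ScalarCall` class, its `deterministic` flag, and the counter-based UDF are all hypothetical stand-ins, chosen only to show how caching the result of a call with constant arguments changes the output of a non-deterministic function, and how a per-function (or per-query) "non-deterministic" setting would avoid it.

```python
import itertools


class ScalarCall:
    """Toy stand-in for an expression node that calls a function on
    constant arguments. Hypothetical; not Impala's ScalarFnCall."""

    def __init__(self, fn, const_args=(), deterministic=True):
        self.fn = fn
        self.const_args = const_args
        # Assumption for illustration: a flag saying whether the result
        # may be cached across rows.
        self.deterministic = deterministic
        self._has_cache = False
        self._cached = None

    def evaluate(self):
        """Return the value of the call for one input row."""
        if not self.deterministic:
            # Non-deterministic: call the function again for every row.
            return self.fn(*self.const_args)
        if not self._has_cache:
            # Deterministic with constant args: evaluate once, then reuse.
            self._cached = self.fn(*self.const_args)
            self._has_cache = True
        return self._cached


def make_counter_udf():
    """Build a non-deterministic 'UDF' that returns 1, 2, 3, ... per call."""
    counter = itertools.count(1)
    return lambda: next(counter)


def run(rows, deterministic):
    """Evaluate the same call once per input row and collect the results."""
    call = ScalarCall(make_counter_udf(), deterministic=deterministic)
    return [call.evaluate() for _ in range(rows)]


if __name__ == "__main__":
    print(run(3, deterministic=True))   # value cached after the first row
    print(run(3, deterministic=False))  # function re-evaluated per row
```

      With the flag left at its default, every row sees the value computed for the first row, which is the incorrect behaviour reported here; with the flag cleared, each row gets a fresh value.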

          People

            Assignee: Tim Armstrong (tarmstrong)
            Reporter: Tim Armstrong (tarmstrong)