Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1317

Rewrite "MAX" as "ORDER BY ... LIMIT ..."

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      We could optimize

      SELECT *
      FROM emp
      WHERE empno = (SELECT max(empno) FROM emp)
      

      to

      SELECT *
      FROM emp
      ORDER BY empno DESC LIMIT 1
      

      (using the fact that empno is unique and non-NULL). Similarly, we can rewrite

      SELECT max(sal)
      FROM emp
      

      to

      SELECT sal
      FROM emp
      ORDER BY sal DESC LIMIT 1
      

      (not making any assumptions about whether sal is unique or allows NULL values) and we can rewrite a query to find the highest paid employee(s) in each department

      SELECT *
      FROM emp AS e
      WHERE sal = (
        SELECT max(sal)
        FROM emp AS e2
        WHERE e2.deptno = e.deptno)
      

      as

      SELECT deptno, empno, sal
      FROM (
        SELECT deptno, empno, sal, FIRST_VALUE(sal) OVER w AS topSal
        FROM emp
        WINDOW w AS (PARTITION BY deptno ORDER BY sal DESC))
      WHERE sal = topSal
      

      We might benefit from a generalized Sort(limit) operator that can find the top N within any prefix of the sort key, not just the top N overall.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              julianhyde Julian Hyde
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated: