Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-6373

Distinct optimization produces broken sql query

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.36.0
    • None
    • core, jdbc-adapter
    • None

    Description

      Can be reproduced with the following test in JDBCTests:

          String statement = "SELECT\n" +
                             "  DISTINCT \"case-column\"\n" +
                             "FROM (\n" +
                             "   SELECT \n" +
                             "   CASE\n" +
                             "      WHEN CAST(? AS VARCHAR) = \"ENAME\" THEN CAST(? AS INTEGER)\n" +
                             "      ELSE CAST(? AS INTEGER)\n" +
                             "      END AS \"case-column\"\n" +
                             "   FROM \"EMP\")";
      
          CalciteAssert.model(JdbcTest.SCOTT_MODEL)
              .query(statement)
              .consumesPreparedStatement(p -> {
                p.setString(1, "name");
                p.setInt(2, 2);
                p.setInt(3, 1);
              })
              .planHasSql("");
      

      Lets assume the following statement is passed through calcite:

      Before:

      SELECT
        DISTINCT "case-column"
      FROM
         SELECT 
         CASE
            WHEN CAST(? AS VARCHAR) = "store_name" THEN CAST(? AS INTEGER)
            ELSE CAST(? AS INTEGER)
            END AS "case-column"
         FROM "foodmart"."store"
      

      After:

      SELECT
         CASE
            WHEN ? = "ENAME" THEN ?
            ELSE ?
         END AS "case-column"
      FROM
         "SCOTT"."EMP"
      GROUP BY
         CASE
            WHEN ? = "ENAME" THEN ?
            ELSE ?
         END
      

      The produced statement hast two issues:
      1. The missing casts (see also https://issues.apache.org/jira/browse/CALCITE-6346)
      2. Instead of pushing the DISTINCT it is replaced with a GROUP BY. This is usually fine but since the field is a case statement containing dynamic parameters it is not.
      During sql syntax evaluation the database will give an error (the field in the select is not contained in group by). This is because the dynamic parameters are not evaluated during sql syntax evaluation.

      I think this could be fixed by adding an alias to the field in the select clause and referencing it in the group by clause instead of duplicating the case statement and the dynamic parameters.

      Attachments

        Activity

          People

            Unassigned Unassigned
            corvinkuebler Corvin Kuebler
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated: