Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-5226

Add support for arbitrary subqueries in WHERE/HAVING, including inside disjuncts

    XMLWordPrintableJSON

Details

    Description

      The following query will error with:
      ERROR: AnalysisException: Subqueries in OR predicates are not supported:

      select 
        c_customer_sk,
        c_first_name,
        c_last_name
      from customer
      where
       (exists (select *
                from web_sales, date_dim
                where c_customer_sk = ws_bill_customer_sk and
                      ws_sold_date_sk = d_date_sk and
                      d_date = '2002-01-01') 
        or 
        exists (select * 
                from catalog_sales, date_dim
                where c_customer_sk = cs_ship_customer_sk and
                      cs_sold_date_sk = d_date_sk and
                      d_date = '2002-01-01') 
      )
      order by 1;
      

      When there is a disjunction of subqueries, it can be transformed into a union all and a semi-join like such which is able to be run in impala today.

      with 
      v1 as (
        select 
           ws_bill_customer_sk as customer_sk
        from web_sales,
             date_dim
        where ws_sold_date_sk = d_date_sk
        and d_date = '2002-01-01'
        union all
        select 
          cs_ship_customer_sk as customer_sk
        from catalog_sales,
             date_dim 
        where cs_sold_date_sk = d_date_sk
        and d_date = '2002-01-01'
      )
      select 
        c_customer_sk,
        c_first_name,
        c_last_name
      from customer
      left semi join v1 on (customer_sk = c_customer_sk)
      order by 1
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              grahn Greg Rahn
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated: