Apache Drill / DRILL-6949

Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled


Details

    Description

      The following query fails on TPC-H SF100 data with the error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates).

      
      set `exec.hashjoin.enable.runtime_filter` = true;
      set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
      set `planner.enable_broadcast_join` = false;
      
      
      select
        count(*)
      from
        lineitem l1
      where
        l1.l_discount IN (
          select
            distinct(cast(l2.l_discount as double))
          from
            lineitem l2);
      
      reset `exec.hashjoin.enable.runtime_filter`;
      reset `exec.hashjoin.runtime_filter.max.waiting.time`;
      reset `planner.enable_broadcast_join`;
      
      

      The subquery uses the DISTINCT keyword, so its result should not contain duplicate join-key values.

      I suspect that the failure is caused by semijoin because the query succeeds when semijoin is disabled explicitly.
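
      To check this, semijoin can be turned off for the session before rerunning the query. This is a minimal sketch only: the option name `planner.enable_semijoin` is an assumption (it is not stated in this report) and should be verified against sys.options on the Drill build in use.

      ```sql
      -- Assumed option name; confirm it first with:
      --   select * from sys.options where name like '%semijoin%';
      set `planner.enable_semijoin` = false;

      -- Rerun the failing count(*) query here, with the same runtime-filter
      -- and broadcast-join settings as in the reproduction above.

      -- Restore the default afterwards.
      reset `planner.enable_semijoin`;
      ```

      If the query succeeds with the option off and fails with it on, that supports the suspicion that the semijoin path triggers the partitioning error.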
       

       

          People

            ben-zvi Boaz Ben-Zvi
            aravi5 Abhishek Ravi