Hive / HIVE-24466

insert queries should not launch job when condition in the query would output 0 rows

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      -- This query would not generate any output and does not launch a job
      select * from tpcds_bin_partitioned_orc_30000.store_sales where 1 = 2;
      
      
      -- This query generates a job (M -> R -> R) and runs for 30+ seconds in 2 node cluster to generate 0 rows.
      
      insert into table delete_orc_10.test_sales_1 select * from tpcds_bin_partitioned_orc_30000.store_sales where 1 = 2;
      
      insert overwrite table delete_orc_10.test_sales_1 select * from tpcds_bin_partitioned_orc_30000.store_sales where ss_sold_date_sk >=2450816+300 and ss_sold_date_sk <= (2450816+100);
      
      
      INFO  : Status: Running (Executing on YARN cluster with App id application_1606875286859_0001)
      
      ----------------------------------------------------------------------------------------------
              VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
      ----------------------------------------------------------------------------------------------
      Map 1 ..........      llap     SUCCEEDED      1          1        0        0       0       0
      Reducer 2 ......      llap     SUCCEEDED      2          2        0        0       0       5
      Reducer 3 ......      llap     SUCCEEDED      2          2        0        0       0       9
      ----------------------------------------------------------------------------------------------
      VERTICES: 03/03  [==========================>>] 100%  ELAPSED TIME: 28.61 s
      ----------------------------------------------------------------------------------------------
      INFO  : Status: DAG finished successfully in 18.72 seconds
      INFO  :
      INFO  : Query Execution Summary
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  : OPERATION                            DURATION
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  : Compile Query                          14.06s
      INFO  : Prepare Plan                            0.17s
      INFO  : Get Query Coordinator (AM)              0.14s
      INFO  : Submit Plan                             0.03s
      INFO  : Start DAG                               0.05s
      INFO  : Run DAG                                18.72s
      INFO  : ----------------------------------------------------------------------------------------------
      
       

      It would be good to skip launching the job when the condition in the query can never be satisfied, as with the constant-false predicate (1 = 2) and the contradictory date range above.
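      To see why the INSERT OVERWRITE predicate above can never match a row, it helps to fold its constants: the lower bound ends up greater than the upper bound. The following is only a minimal sketch of that arithmetic in Python, not Hive code; `range_is_empty` is a hypothetical helper introduced for illustration and does not correspond to anything in Hive's optimizer.

```python
# Hypothetical helper for illustration only; this is NOT part of Hive.
def range_is_empty(lower: int, upper: int) -> bool:
    """Return True when no integer x satisfies lower <= x <= upper."""
    return lower > upper


# Bounds copied from the INSERT OVERWRITE predicate in the description:
#   ss_sold_date_sk >= 2450816+300 and ss_sold_date_sk <= (2450816+100)
lower_bound = 2450816 + 300  # folds to 2451116
upper_bound = 2450816 + 100  # folds to 2450916

# The lower bound exceeds the upper bound, so the range selects nothing.
print(range_is_empty(lower_bound, upper_bound))  # True
```

      A check of this kind needs only the constants in the query text, not the table data, which is why the outcome is knowable before any job is submitted.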

            People

              Assignee: Unassigned
              Reporter: Rajesh Balamohan (rajesh.balamohan)
              Votes: 0
              Watchers: 4