Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-27924

Incremental rebuild goes wrong when inserts and deletes overlap between the source tables

    XMLWordPrintableJSON

Details

    Description

      Summary

      The incremental rebuild plan and execution output are incorrect when one side of the table join has inserted/deleted join keys that the other side has deleted/inserted (note the order).

      The argument is that tuples that have never been present simultaneously should not interact with one another, i.e., one's inserts should not join the other's deletes.

      Related Test Case

      The bug was discovered during replication of the test case:

      hive/ql/src/test/queries/clientpositive/materialized_view_create_rewrite_5.q

      Steps to Reproduce the Issue

      1. Configurations:
        SET hive.vectorized.execution.enabled=false;
        set hive.support.concurrency=true;
        set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
        set hive.strict.checks.cartesian.product=false;
        set hive.materializedview.rewriting=true;
      2. create table cmv_basetable_n6 (a int, b varchar(256), c decimal(10,2), d int) stored as orc TBLPROPERTIES ('transactional'='true'); 
      3. insert into cmv_basetable_n6 values
        (1, 'alfred', 10.30, 2),
        (1, 'charlie', 20.30, 2); 
      4. create table cmv_basetable_2_n3 (a int, b varchar(256), c decimal(10,2), d int) stored as orc TBLPROPERTIES ('transactional'='true'); 
      5. insert into cmv_basetable_2_n3 values
        (1, 'bob', 30.30, 2),
        (1, 'bonnie', 40.30, 2);
      6. CREATE MATERIALIZED VIEW cmv_mat_view_n6 TBLPROPERTIES ('transactional'='true') AS
        SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c
        FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a = cmv_basetable_2_n3.a)
        WHERE cmv_basetable_2_n3.c > 10.0;
      7. show tables; 

      8. Select tuples, including deletion and with VirtualColumn's, from the MV and source tables. We see that the MV is correctly built upon creation:
        SELECT ROW__IS__DELETED, ROW__ID, * FROM cmv_mat_view_n6('acid.fetch.deleted.rows'='true');

      9. SELECT ROW__IS__DELETED, ROW__ID, * FROM cmv_basetable_n6('acid.fetch.deleted.rows'='true'); 

      10. SELECT ROW__IS__DELETED, ROW__ID, * FROM cmv_basetable_2_n3('acid.fetch.deleted.rows'='true'); 

      11. Now make an insert to the LHS and a delete to the RHS source table:
        insert into cmv_basetable_n6 values
        (1, 'kevin', 50.30, 2);
        
        DELETE FROM cmv_basetable_2_n3 WHERE b = 'bonnie';
      12. Select again to see what happened:
        SELECT ROW__IS__DELETED, ROW__ID, * FROM cmv_basetable_n6('acid.fetch.deleted.rows'='true'); 

      13. SELECT ROW__IS__DELETED, ROW__ID, * FROM cmv_basetable_2_n3('acid.fetch.deleted.rows'='true'); 

      14. Use EXPLAIN CBO to produce the incremental rebuild plan for the MV, which is incorrect already:
        EXPLAIN CBO
        ALTER MATERIALIZED VIEW cmv_mat_view_n6 REBUILD; 

      15. Rebuild MV and see (incorrect) results:
        ALTER MATERIALIZED VIEW cmv_mat_view_n6 REBUILD;
        
        SELECT ROW__IS__DELETED, ROW__ID, * FROM cmv_mat_view_n6('acid.fetch.deleted.rows'='true');

      16. Run MV definition directly, which outputs incorrect results because the MV is enabled for MV-based query rewrite, i.e., the following query will output what's in the MV for the time being:
        SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c
        FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a = cmv_basetable_2_n3.a)
        WHERE cmv_basetable_2_n3.c > 10.0; 

      17. Disable MV-based query rewrite for the MV and re-run the definition, which should give the correct results:
        ALTER MATERIALIZED VIEW cmv_mat_view_n6 DISABLE REWRITE;
        
        SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c
        FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a = cmv_basetable_2_n3.a)
        WHERE cmv_basetable_2_n3.c > 10.0;

      Note

      This issue is also seen in update-incurred inserts/deletes.

      Attachments

        1. 截图.PNG
          3 kB
          Wenhao Li
        2. 截图1.PNG
          15 kB
          Wenhao Li
        3. 截图2.PNG
          11 kB
          Wenhao Li
        4. 截图3.PNG
          11 kB
          Wenhao Li
        5. 截图4.PNG
          15 kB
          Wenhao Li
        6. 截图5.PNG
          11 kB
          Wenhao Li
        7. 截图6.PNG
          43 kB
          Wenhao Li
        8. 截图7.PNG
          27 kB
          Wenhao Li
        9. 截图8.PNG
          4 kB
          Wenhao Li
        10. 截图9.PNG
          3 kB
          Wenhao Li

        Issue Links

          Activity

            People

              kkasa Krisztian Kasa
              wenhaoli Wenhao Li
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: