Hive / HIVE-23712

metadata-only queries return incorrect results with empty acid partition

Details

    Description

      Similarly to HIVE-15397, metadata-only queries can return incorrect results. The following repro scenario affects master:

      set hive.support.concurrency=true;
      set hive.exec.dynamic.partition.mode=nonstrict;
      set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
      
      set hive.optimize.metadataonly=true;
      
      create table test1 (id int, val string) partitioned by (val2 string) STORED AS ORC TBLPROPERTIES ('transactional'='true');
      describe formatted test1;
      
      alter table test1 add partition (val2='foo');
      alter table test1 add partition (val2='bar');
      insert into test1 partition (val2='foo') values (1, 'abc');
      select distinct val2, current_timestamp from test1;
      insert into test1 partition (val2='bar') values (1, 'def');
      delete from test1 where val2 = 'bar';
      
      select '--> hive.optimize.metadataonly=true';
      select distinct val2, current_timestamp from test1;
      
      
      set hive.optimize.metadataonly=false;
      select '--> hive.optimize.metadataonly=false';
      select distinct val2, current_timestamp from test1;
      
      select current_timestamp, * from test1;
      

      In this case, with the metadata-only optimization enabled, the query returns 2 rows instead of 1 after the delete:
      https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128
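
      A possible way to check and sidestep the problem, sketched against the test1 table from the repro above. It assumes that the delete removes the rows but leaves the val2='bar' partition registered in the metastore, which would explain why a query answered from partition metadata alone still reports it. The behavior described in the comments is inferred from the description, not verified output.

      ```sql
      -- Assumption: the emptied partition is still listed after the delete.
      show partitions test1;

      -- Workaround sketch: disable the optimization for this session so the
      -- query reads the actual data instead of relying on partition metadata.
      set hive.optimize.metadataonly=false;
      select distinct val2, current_timestamp from test1;
      ```

      Disabling the setting only for the affected session keeps the optimization available for other workloads.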

            People

              abstractdog László Bodor
              abstractdog László Bodor
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m