  Hive / HIVE-20901

running compactor when there is nothing to do produces duplicate data


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0-alpha-2
    • Component/s: Transactions
    • Labels: None

    Description

      Suppose we run minor compaction twice via ALTER TABLE ... COMPACT 'minor'.

      The second compaction request should have nothing to do, but I don't think there is a check for that. This is visible in the context of HIVE-20823, where each compactor run produces a delta with a new visibility suffix, so we end up with something like:

      target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands3-1541810844849/warehouse/t/
      
      ├── delete_delta_0000001_0000002_v0000019
      │   ├── _orc_acid_version
      │   └── bucket_00000
      ├── delete_delta_0000001_0000002_v0000021
      │   ├── _orc_acid_version
      │   └── bucket_00000
      ├── delta_0000001_0000001_0000
      │   ├── _orc_acid_version
      │   └── bucket_00000
      ├── delta_0000001_0000002_v0000019
      │   ├── _orc_acid_version
      │   └── bucket_00000
      ├── delta_0000001_0000002_v0000021
      │   ├── _orc_acid_version
      │   └── bucket_00000
      └── delta_0000002_0000002_0000
          ├── _orc_acid_version
          └── bucket_00000

      That is, there are two deltas (and two delete deltas) with the same write ID range, 1 to 2.
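To make the duplication easy to spot, here is a minimal sketch that groups the directory names from the listing by write ID range. It is not Hive code: the regular expression and the function name `find_duplicate_ranges` are assumptions made for illustration, based only on the names shown above.

```python
import re
from collections import defaultdict

# Assumed name layout, inferred from the listing above:
#   delta_<minWriteId>_<maxWriteId>_<statementId>       e.g. delta_0000001_0000001_0000
#   delta_<minWriteId>_<maxWriteId>_v<visibilityTxn>    e.g. delta_0000001_0000002_v0000019
# and the same with a delete_delta prefix.
DELTA_RE = re.compile(r"^(delete_delta|delta)_(\d+)_(\d+)(?:_(v?\d+))?$")

def find_duplicate_ranges(dir_names):
    """Group delta directories by (kind, min write ID, max write ID) and
    return only the groups that contain more than one directory."""
    groups = defaultdict(list)
    for name in dir_names:
        m = DELTA_RE.match(name)
        if m is None:
            continue  # not a delta directory, ignore it
        kind, lo, hi = m.group(1), int(m.group(2)), int(m.group(3))
        groups[(kind, lo, hi)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}

# Directory names copied from the listing above.
listing = [
    "delete_delta_0000001_0000002_v0000019",
    "delete_delta_0000001_0000002_v0000021",
    "delta_0000001_0000001_0000",
    "delta_0000001_0000002_v0000019",
    "delta_0000001_0000002_v0000021",
    "delta_0000002_0000002_0000",
]
dups = find_duplicate_ranges(listing)
```

For this listing, `dups` has one entry for the delta range 1 to 2 and one for the delete delta range 1 to 2, each holding the v0000019 and v0000021 directories.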

      This is bad. It probably happens today as well, except that the new run produces a delta with the same name and clobbers the previous one, which may interfere with writers.

       

      Needs investigation.
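As a starting point for the missing "nothing to do" check, here is a rough sketch of one possible rule. It assumes that a directory with a `_v` suffix is compactor output and that a minor compaction has work only if some other delta is not already covered by such a directory. The function `minor_compaction_has_work` is hypothetical and is not how the Hive compactor decides this.

```python
import re

# Same assumed name layout as in the listing: an optional trailing suffix that
# is either a statement ID (0000) or a visibility suffix (v0000019).
DELTA_RE = re.compile(r"^(delete_delta|delta)_(\d+)_(\d+)(?:_(v?\d+))?$")

def minor_compaction_has_work(dir_names):
    """Return True if some non-compacted delta lies outside every compacted range.

    Simplification (an assumption): delta and delete_delta directories are
    treated alike, and only write ID ranges are compared.
    """
    compacted = []  # (lo, hi) ranges of directories carrying a _v suffix
    raw = []        # (lo, hi) ranges of all other delta directories
    for name in dir_names:
        m = DELTA_RE.match(name)
        if m is None:
            continue
        lo, hi, suffix = int(m.group(2)), int(m.group(3)), m.group(4)
        if suffix is not None and suffix.startswith("v"):
            compacted.append((lo, hi))
        else:
            raw.append((lo, hi))
    return any(
        not any(c_lo <= lo and hi <= c_hi for c_lo, c_hi in compacted)
        for lo, hi in raw
    )

before_first_run = ["delta_0000001_0000001_0000", "delta_0000002_0000002_0000"]
after_first_run = before_first_run + [
    "delta_0000001_0000002_v0000019",
    "delete_delta_0000001_0000002_v0000019",
]
first_run_needed = minor_compaction_has_work(before_first_run)
second_run_needed = minor_compaction_has_work(after_first_run)
```

Under these assumptions the first request has work and the second does not, so the second run could return early instead of writing a new pair of v0000021 directories.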

       

      Initial theory (retracted): AcidUtils.getAcidState() returns both deltas as if they were distinct, which would effectively duplicate data. Update: there is no data duplication; getAcidState() will not use two deltas with the same write ID range.
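To illustrate why a reader that uses only one delta per write ID range sees no duplicate rows, here is a small sketch. It is not the logic of AcidUtils.getAcidState(). The tie-break (keep the highest visibility suffix) and the name `select_one_per_range` are assumptions chosen only for the example.

```python
import re

DELTA_RE = re.compile(r"^(delete_delta|delta)_(\d+)_(\d+)(?:_(v?\d+))?$")

def select_one_per_range(dir_names):
    """Keep a single compacted directory per (kind, min, max) write ID range.

    Assumption: among directories with a _v suffix and the same range, the one
    with the highest visibility suffix wins. Directories without a _v suffix
    are passed through unchanged.
    """
    chosen = {}       # (kind, lo, hi) -> (visibility suffix as int, name)
    passthrough = []  # deltas that carry no visibility suffix
    for name in dir_names:
        m = DELTA_RE.match(name)
        if m is None:
            continue
        kind, lo, hi, suffix = m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)
        if suffix is None or not suffix.startswith("v"):
            passthrough.append(name)
            continue
        vis = int(suffix[1:])
        key = (kind, lo, hi)
        if key not in chosen or vis > chosen[key][0]:
            chosen[key] = (vis, name)
    return sorted(passthrough + [name for _, name in chosen.values()])

selected = select_one_per_range([
    "delete_delta_0000001_0000002_v0000019",
    "delete_delta_0000001_0000002_v0000021",
    "delta_0000001_0000002_v0000019",
    "delta_0000001_0000002_v0000021",
])
```

With the four compacted directories from the listing as input, `selected` contains one delete delta and one delta for the range 1 to 2, so each row would be read once.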

       

       

      Attachments

        1. HIVE-20901.1.patch
          2 kB
          Abhishek Somani
        2. HIVE-20901.2.patch
          4 kB
          Abhishek Somani

            People

              Assignee: Abhishek Somani (asomani)
              Reporter: Eugene Koifman (ekoifman)
              Votes: 0
              Watchers: 4
