Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-16832

duplicate ROW__ID possible in multi insert into transactional table

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.0.0
    • Component/s: Transactions
    • Labels:
      None
    • Target Version/s:

      Description

       create table AcidTablePart(a int, b int) partitioned by (p string) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
       create temporary table if not exists data1 (x int);
       insert into data1 values (1);
       from data1
         insert into AcidTablePart partition(p) select 0, 0, 'p' || x
         insert into AcidTablePart partition(p='p1') select 0, 1
      

      Each branch of this multi-insert create a row in partition p1/bucket0 with ROW__ID=(1,0,0).
      The same can happen when running SQL Merge (HIVE-10924) statement that has both Insert and Update clauses when target table has 'transactional'='true','transactional_properties'='default' (see HIVE-14035). This is so because Merge is internally run as a multi-insert statement.

      The solution relies on statement ID introduced in HIVE-11030. Each Insert clause of a multi-insert is gets a unique ID.
      The ROW__ID.bucketId now becomes a bit packed triplet (format version, bucketId, statementId).
      (Since ORC stores field names in the data file we can't rename ROW__ID.bucketId).
      This ensures that there are no collisions and retains desired sort properties of ROW__ID.
      In particular SortedDynPartitionOptimizer works w/o any changes even in cases where there fewer reducers than buckets.

        Attachments

        1. HIVE-16832.01.patch
          32 kB
          Eugene Koifman
        2. HIVE-16832.03.patch
          78 kB
          Eugene Koifman
        3. HIVE-16832.04.patch
          84 kB
          Eugene Koifman
        4. HIVE-16832.05.patch
          95 kB
          Eugene Koifman
        5. HIVE-16832.06.patch
          95 kB
          Eugene Koifman
        6. HIVE-16832.08.patch
          113 kB
          Eugene Koifman
        7. HIVE-16832.09.patch
          114 kB
          Eugene Koifman
        8. HIVE-16832.10.patch
          128 kB
          Eugene Koifman
        9. HIVE-16832.11.patch
          227 kB
          Eugene Koifman
        10. HIVE-16832.14.patch
          34 kB
          Eugene Koifman
        11. HIVE-16832.15.patch
          87 kB
          Eugene Koifman
        12. HIVE-16832.16.patch
          95 kB
          Eugene Koifman
        13. HIVE-16832.17.patch
          100 kB
          Eugene Koifman
        14. HIVE-16832.18.patch
          105 kB
          Eugene Koifman
        15. HIVE-16832.19.patch
          117 kB
          Eugene Koifman
        16. HIVE-16832.20.patch
          121 kB
          Eugene Koifman
        17. HIVE-16832.20.patch
          121 kB
          Eugene Koifman
        18. HIVE-16832.21.patch
          121 kB
          Eugene Koifman
        19. HIVE-16832.22.patch
          127 kB
          Eugene Koifman

          Issue Links

            Activity

              People

              • Assignee:
                ekoifman Eugene Koifman
                Reporter:
                ekoifman Eugene Koifman
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: