Hive / HIVE-26018

The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0, 4.0.0
    • Fix Version/s: None
    • Component/s: Tez
    • Labels: None

    Description

      The result of UNIQUEJOIN on Hive on Tez is inconsistent with the result on MR, and the Tez result is incorrect. For example:

      CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
      CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;

      insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
      insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');

      SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE  T2_n1x b (b.key);

      Hive on Tez result: wrong

      a.key   b.key  
      aaa     aaa    
      bbb     NULL  
      ccc     ccc    
      NULL   ddd    

      ------------------
      Hive on MR result: right

      a.key   b.key  
      aaa     aaa    
      bbb     NULL  
      ccc     ccc    

      -----------------

      SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

      Hive on Tez result: wrong

      a.key   b.key  
      aaa     aaa    
      bbb     NULL  
      ccc     ccc    
      NULL   ddd    

      -----------------

      Hive on MR result: right

      a.key   b.key  
      aaa     aaa    
      ccc     ccc    
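
One way to characterise the reported outputs is to compare them with ordinary join shapes over the same keys. The sketch below is plain Python, not Hive code. The helper names (`inner_join`, `left_outer_join`, `full_outer_join`) are hypothetical, and it makes no claim about how either engine implements UNIQUEJOIN. It only checks which standard join each reported row set coincides with, assuming unique keys per table as in the reproduction.

```python
# Keys from the reproduction tables T1_n1x and T2_n1x in this report.
t1_keys = ["aaa", "bbb", "ccc"]
t2_keys = ["aaa", "ddd", "ccc"]


def inner_join(left, right):
    """Rows whose key appears on both sides."""
    return [(k, k) for k in left if k in right]


def left_outer_join(left, right):
    """Every left key; the right column is None when there is no match."""
    return [(k, k if k in right else None) for k in left]


def full_outer_join(left, right):
    """Left outer join rows, then right-only keys with None on the left."""
    rows = left_outer_join(left, right)
    rows += [(None, k) for k in right if k not in left]
    return rows


# Row sets exactly as listed in the report (NULL written as None).
tez_reported = [("aaa", "aaa"), ("bbb", None), ("ccc", "ccc"), (None, "ddd")]
mr_preserve_reported = [("aaa", "aaa"), ("bbb", None), ("ccc", "ccc")]
mr_plain_reported = [("aaa", "aaa"), ("ccc", "ccc")]

# Compare each reported row set with a standard join over the same keys.
print(full_outer_join(t1_keys, t2_keys) == tez_reported)
print(left_outer_join(t1_keys, t2_keys) == mr_preserve_reported)
print(inner_join(t1_keys, t2_keys) == mr_plain_reported)
```

On this data, the Tez output for both queries coincides with a full outer join on key, the MR output for the PRESERVE query coincides with a left outer join, and the MR output for the query without PRESERVE coincides with an inner join.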

       

       

          People

            Assignee: Unassigned
            Reporter: GuangMing Lu (luguangming)