  1. Hive
  2. HIVE-26150

OrcRawRecordMerger reads each row twice

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0-alpha-2
    • Fix Version/s: None
    • Component/s: ORC, Transactions
    • Labels: None

    Description

      OrcRawRecordMerger reads each row twice. The issue does not surface because the merger is only ever used with the parameter "collapseEvents" set to true, which filters out one of the two rows.

      collapseEvents=true and collapseEvents=false should produce the same result: in the current ACID implementation each event has a distinct rowid, so two identical rows should never occur. They occur here only because of this bug.
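
      As a rough illustration of why collapsing hides the double read, here is a self-contained toy model. It is not Hive code: the class CollapseEventsSketch, the Event type, and the methods merge and readEachRowTwice are all hypothetical names, and the row identity is assumed to be (writeId, bucket, rowId).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollapseEventsSketch {

    /** Hypothetical stand-in for an ACID event; identity is (writeId, bucket, rowId). */
    static final class Event {
        final long writeId;
        final int bucket;
        final long rowId;
        final String payload;

        Event(long writeId, int bucket, long rowId, String payload) {
            this.writeId = writeId;
            this.bucket = bucket;
            this.rowId = rowId;
            this.payload = payload;
        }

        boolean sameRow(Event other) {
            return writeId == other.writeId && bucket == other.bucket && rowId == other.rowId;
        }
    }

    /**
     * Walks events that are already sorted by row identity. With collapseEvents
     * enabled, only the first event for each identity is emitted.
     */
    static List<Event> merge(List<Event> sorted, boolean collapseEvents) {
        List<Event> out = new ArrayList<>();
        Event prev = null;
        for (Event e : sorted) {
            if (collapseEvents && prev != null && prev.sameRow(e)) {
                continue; // a later event for the same row is dropped
            }
            out.add(e);
            prev = e;
        }
        return out;
    }

    /** Simulates the reported behaviour: every row reaches the merger twice. */
    static List<Event> readEachRowTwice(List<Event> rows) {
        List<Event> out = new ArrayList<>();
        for (Event e : rows) {
            out.add(e);
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> rows = Arrays.asList(
                new Event(1, 0, 0, "a"),
                new Event(1, 0, 1, "b"));
        List<Event> doubled = readEachRowTwice(rows);

        System.out.println(merge(doubled, true).size());  // 2: duplicates are masked
        System.out.println(merge(doubled, false).size()); // 4: duplicates are visible
    }
}
```

      With distinct row identities and a correct reader, both settings would return the same rows in this model, so any difference between them points at duplicated input rather than at the collapse logic.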

      To reproduce the issue, it is sufficient to set the second parameter to false here and run the tests in TestOrcRawRecordMerger. Two tests fail:

      mvn test -Dtest=TestOrcRawRecordMerger -pl ql
      
      [INFO] Results:
      [INFO]
      [ERROR] Failures:
      [ERROR]   TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta:1332 Found unexpected row: (0,ignore.1)
      [ERROR]   TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta:1208 Found unexpected row: (0,ignore.1)
      

            People

              Assignee: Unassigned
              Reporter: Alessandro Solimando (asolimando)