Hive / HIVE-20580

OrcInputFormat.isOriginal() should not rely on hive.acid.key.index

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Transactions
    • Labels: None

    Description

      org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.isOriginal() checks for the presence of hive.acid.key.index in the file footer. That entry is only created when the file is written by OrcRecordUpdater. The method should instead check for the presence of the Acid metadata columns, so that a file produced by something other than OrcRecordUpdater is still recognized as an Acid file.

      Also, hive.acid.key.index counts the number of events of each type, which is not really useful for Acid V2 (as of Hive 3), since each file contains only one type of event.
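      A minimal sketch of the schema-based check proposed above, not the code from the attached patches. It assumes the Acid metadata columns are the top-level fields operation, originalTransaction, bucket, rowId, currentTransaction and row; the class AcidSchemaCheck and its methods are hypothetical names, and the caller is assumed to have already read the top-level field names from the ORC file schema.

```java
import java.util.Arrays;
import java.util.List;

public final class AcidSchemaCheck {
    // Assumed top-level field names of an Acid event row, in order.
    // Verify these against OrcRecordUpdater in the Hive version in use.
    static final List<String> ACID_COLUMNS = Arrays.asList(
        "operation", "originalTransaction", "bucket",
        "rowId", "currentTransaction", "row");

    private AcidSchemaCheck() {}

    // Hypothetical helper: true when the file's top-level fields are
    // exactly the assumed Acid metadata columns.
    static boolean hasAcidColumns(List<String> topLevelFieldNames) {
        return ACID_COLUMNS.equals(topLevelFieldNames);
    }

    // A file is "original" (pre-Acid layout) when it lacks the Acid
    // columns, regardless of which writer produced it or what the
    // footer metadata contains.
    static boolean isOriginal(List<String> topLevelFieldNames) {
        return !hasAcidColumns(topLevelFieldNames);
    }
}
```

      With this approach the decision depends only on the file schema, so a writer that never emits hive.acid.key.index can still produce files that are treated as Acid files.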

      Attachments

        1. HIVE-20580.patch
          9 kB
          Peter Vary
        2. HIVE-20580.2.patch
          12 kB
          Peter Vary
        3. HIVE-20580.3.patch
          12 kB
          Peter Vary
        4. HIVE-20580.4.patch
          13 kB
          Peter Vary
        5. HIVE-20580.5.patch
          13 kB
          Peter Vary
        6. HIVE-20580.6.patch
          13 kB
          Peter Vary

            People

              Assignee: Peter Vary (pvary)
              Reporter: Eugene Koifman (ekoifman)