Hive / HIVE-19329

Disallow some optimizations/behaviors for external tables

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      External tables in Hive are often used in situations where the data is created and managed by applications outside of Hive. Several issues can occur when data is written to table directories by external applications:

      • If an application is writing files to a table/partition at the same time that Hive tries to merge files for the same table/partition (ALTER TABLE CONCATENATE, or hive.merge.tezfiles during insert), data can be lost.
      • When new data has been added to the table by external applications, the Hive table statistics are often out of date with respect to the current state of the data. This can produce wrong results when queries are answered from stats, or cause bad query plans to be generated.
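
      To make the stale-statistics problem concrete, here is a toy model in Python. It is only an illustration, not Hive code: the Table class and its methods (hive_insert, external_write, count_from_stats, count_from_scan) are hypothetical names invented for this sketch. It shows a row count answered from recorded stats diverging from a count obtained by scanning the files.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy model of a table: data files plus a recorded row-count statistic.

    All names here are hypothetical; this is not Hive's API.
    """
    files: list = field(default_factory=list)  # each entry is a list of rows
    num_rows_stat: int = 0                     # row count recorded at last stats update

    def hive_insert(self, rows):
        # A write that goes through the engine also updates the statistic.
        self.files.append(list(rows))
        self.num_rows_stat += len(rows)

    def external_write(self, rows):
        # An outside application drops a file into the table directory;
        # nothing updates the recorded statistic.
        self.files.append(list(rows))

    def count_from_stats(self):
        # Answer the count without reading any data.
        return self.num_rows_stat

    def count_from_scan(self):
        # Answer the count by reading every file.
        return sum(len(f) for f in self.files)


t = Table()
t.hive_insert(["a", "b", "c"])
t.external_write(["d", "e"])

stale = t.count_from_stats()   # reflects only the engine-side insert
actual = t.count_from_scan()   # reflects all files in the directory
print(stale, actual)
```

      In this model the stats-based answer misses the two externally written rows, which is the "wrong results" case described above.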

      Some of these operations should be blocked in Hive. It looks like some already have been (HIVE-17403).
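
      One possible shape for such a check is sketched below in Python. This is an assumption about how a guard could look, not the approach taken in Hive or in HIVE-17403. The operation names in RISKY_FOR_EXTERNAL and the helper functions are hypothetical; the sketch assumes a table is identified as external by a table type of EXTERNAL_TABLE or a table parameter EXTERNAL=TRUE.

```python
# Hypothetical operation names, used only for this sketch.
RISKY_FOR_EXTERNAL = {
    "concatenate",             # stands in for ALTER TABLE CONCATENATE
    "merge_files_on_insert",   # stands in for file merging during insert
    "answer_query_from_stats", # stands in for answering queries from stats
}


def is_external(table_type: str, params: dict) -> bool:
    # Assumption: a table counts as external if its type says so, or if
    # its parameters carry EXTERNAL=TRUE (compared case-insensitively).
    return (
        table_type == "EXTERNAL_TABLE"
        or params.get("EXTERNAL", "").upper() == "TRUE"
    )


def is_allowed(operation: str, table_type: str, params: dict) -> bool:
    # Block only the risky operations, and only for external tables.
    if operation in RISKY_FOR_EXTERNAL and is_external(table_type, params):
        return False
    return True


print(is_allowed("concatenate", "EXTERNAL_TABLE", {}))  # blocked
print(is_allowed("concatenate", "MANAGED_TABLE", {}))   # permitted
```

      Keeping the list of blocked operations in one place would make it easy to see which behaviors differ between managed and external tables.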

            People

              Assignee: Jason Dere (jdere)
              Reporter: Jason Dere (jdere)
              Votes: 1
              Watchers: 5
