  1. Hive
  2. HIVE-27224

Enhance drop table/partition command

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Hive, Standalone Metastore
    • Labels: None

    Description

      Problem Statement:

      If a table has a large number of partitions, the drop table command takes a long time to finish. To improve the command, we propose the following:

      • Perform all HMS-to-database queries issued by drop table in batches (not only those against the partitions table), so that the command does not fail with exceptions such as "transaction id not found" or other timeouts. How long these queries take depends directly on backend database performance.
      • Display which action is in progress during drop table, so that the user knows which step is taking the most time and how many steps have completed so far. Clients should have log statements (at least at DEBUG level) that report how many partitions or batches are being processed and the current iteration, so that an approximate timeout for such a large HMS operation can be estimated.
        • It would also be useful to log the time taken by each HMS API call, which reflects the response time of the backend database.
      • Support a retry option: if drop table fails partway through, the next run should continue with the remaining operations instead of failing because of missing or stale entries.
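
      The three proposals can be illustrated together with a minimal sketch. It is not Hive code: `BatchDropper`, `dropInBatches`, and `InMemoryStore` are hypothetical names made up for this illustration, the in-memory set stands in for the backend database, and `java.util.logging` (where FINE roughly corresponds to DEBUG) is used only to keep the example free of dependencies.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

class BatchedDropSketch {
    private static final Logger LOG = Logger.getLogger(BatchedDropSketch.class.getName());

    /** Hypothetical stand-in for one HMS call that drops a batch of partitions. */
    interface BatchDropper {
        /** Drops the named partitions and returns how many existed; missing names are skipped, not errors. */
        int dropBatch(List<String> partitionNames);
    }

    /** Drops partitions in fixed-size batches, logging progress and per-call time; returns the total dropped. */
    static int dropInBatches(List<String> partitionNames, int batchSize, BatchDropper dropper) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int totalBatches = (partitionNames.size() + batchSize - 1) / batchSize;
        int dropped = 0;
        for (int i = 0; i < totalBatches; i++) {
            int from = i * batchSize;
            int to = Math.min(from + batchSize, partitionNames.size());
            List<String> batch = partitionNames.subList(from, to);

            long start = System.nanoTime();
            dropped += dropper.dropBatch(batch);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            // FINE is the java.util.logging level closest to DEBUG.
            LOG.fine(String.format("drop batch %d/%d: %d partitions in %d ms",
                    i + 1, totalBatches, batch.size(), elapsedMs));
        }
        return dropped;
    }

    /** In-memory stand-in for the backend database (an assumption of this sketch). */
    static final class InMemoryStore implements BatchDropper {
        final Set<String> partitions = new HashSet<>();
        int calls = 0;

        @Override
        public int dropBatch(List<String> partitionNames) {
            calls++;
            int removed = 0;
            for (String name : partitionNames) {
                // Entries already removed by an earlier, failed run are ignored.
                if (partitions.remove(name)) {
                    removed++;
                }
            }
            return removed;
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            names.add("ds=" + i);
        }
        store.partitions.addAll(names);
        store.partitions.remove("ds=3"); // simulate an earlier, partially completed run

        int dropped = dropInBatches(names, 4, store);
        System.out.println(dropped + " partitions dropped in " + store.calls + " calls");
    }
}
```

      Because a missing entry is skipped rather than treated as an error, running the same drop again after a partial failure simply finds less to remove, which is the behaviour the retry proposal asks for. A real implementation would also have to decide the batch size and the transaction boundaries per batch, which this sketch leaves open.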

          People

            Assignee: Unassigned
            Reporter: Taraka Rama Rao Lethavadla (tarak271)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: