  1. Hive
  2. HIVE-26562

HMS partitions quota

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Metastore
    • Labels: None

    Description

      Across all versions, Hive has suffered when tables accumulate a very large number of partitions: performance degrades, resource usage grows, and the service can hit the JDK array size limit when writing the Thrift responses.
      A 'partition quota' on the HMS could prevent a number of these issues, or at least avoid having to deal with them when it is already too late to restructure the table schemas (because of the number of jobs, clients, and so on built around them).

      We already have hive.limit.query.max.table.partition and hive.metastore.limit.partition.request (HIVE-13884/HIVE-23556), as well as hive.exec.max.dynamic.partitions. The latter is fine for a single execution, but it cannot limit the total number of partitions when dynamic inserts run sequentially.
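
      The gap with a per-execution cap can be illustrated with a small model. This is a sketch only, not Hive code: the class and method names are hypothetical, and the cap value simply stands in for hive.exec.max.dynamic.partitions. Each run stays under the cap, yet the table total keeps growing.

```java
// Hypothetical model, not Hive code: a per-run cap on new dynamic partitions
// does not bound the cumulative partition count of a table.
final class DynamicPartitionCapDemo {

    // Stand-in for hive.exec.max.dynamic.partitions; the value is illustrative.
    static final int MAX_DYNAMIC_PARTITIONS_PER_RUN = 1000;

    // Returns the table's partition count after one insert, or throws if this
    // single run alone would create more partitions than the per-run cap.
    static long runInsert(long existingPartitions, int newPartitions) {
        if (newPartitions > MAX_DYNAMIC_PARTITIONS_PER_RUN) {
            throw new IllegalStateException(
                "per-run dynamic partition cap exceeded: " + newPartitions);
        }
        return existingPartitions + newPartitions;
    }

    // Applies the same insert size repeatedly, as a recurring job would.
    static long runSequentialInserts(int runs, int newPartitionsPerRun) {
        long total = 0;
        for (int i = 0; i < runs; i++) {
            total = runInsert(total, newPartitionsPerRun);
        }
        return total;
    }

    public static void main(String[] args) {
        // 365 runs of 900 new partitions each: every run passes the cap.
        System.out.println(runSequentialInserts(365, 900)); // prints 328500
    }
}
```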

      On HDFS we have a quota for files, but not for directories.

      I propose that the Hive team evaluate adding an upper bound directly on the HMS that prevents the partitions of a table from growing indefinitely (e.g. aborting/failing the operation when the limit is hit). It may not solve all the issues, but it would most likely help.
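
      As a rough illustration of the proposed behaviour, the check could look like the sketch below. Everything in it is an assumption made for illustration: PartitionQuotaSketch, QuotaExceededException, and checkAddPartitions are hypothetical names, not existing HMS classes or APIs, and the current partition count is passed in rather than read from the metastore.

```java
// Hypothetical sketch of a per-table partition quota check. None of these
// names exist in Hive; the current count is supplied by the caller.
final class PartitionQuotaSketch {

    // Raised when an operation would push a table past its quota.
    static final class QuotaExceededException extends RuntimeException {
        QuotaExceededException(String message) {
            super(message);
        }
    }

    private final long maxPartitionsPerTable;

    PartitionQuotaSketch(long maxPartitionsPerTable) {
        if (maxPartitionsPerTable < 0) {
            throw new IllegalArgumentException("quota must be non-negative");
        }
        this.maxPartitionsPerTable = maxPartitionsPerTable;
    }

    // Returns the resulting partition count if the add is allowed; otherwise
    // throws so that the caller can abort the operation before any change.
    long checkAddPartitions(String tableName, long currentCount, long toAdd) {
        if (currentCount < 0 || toAdd < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // Compare by subtraction so the check cannot overflow.
        if (currentCount > maxPartitionsPerTable
                || toAdd > maxPartitionsPerTable - currentCount) {
            throw new QuotaExceededException(
                "table " + tableName + " would exceed the partition quota of "
                    + maxPartitionsPerTable);
        }
        return currentCount + toAdd;
    }

    public static void main(String[] args) {
        PartitionQuotaSketch quota = new PartitionQuotaSketch(100_000);
        System.out.println(quota.checkAddPartitions("db.events", 99_000, 900)); // prints 99900
        try {
            quota.checkAddPartitions("db.events", 99_900, 900);
        } catch (QuotaExceededException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      Unlike a per-execution cap, this check is made against the table's total, so sequential inserts are rejected once the bound is reached.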

          People

            Assignee: Unassigned
            Reporter: Adriano (adrenas)