Hive / HIVE-24805

Compactor: Initiator shouldn't fetch table details again and again for partitioned tables

Details

    Description

      The Initiator shouldn't fetch the table details separately for each partition of a table. When there are a large number of databases/tables, the Initiator takes a long time to complete its initial iteration, and the load on the DB also increases.

      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129

      https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456

      For all of the following partitions the table details are the same. However, the Initiator ends up fetching the table details from HMS again for every partition.

      2021-02-22 08:13:16,106 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
      2021-02-22 08:13:16,124 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
      2021-02-22 08:13:16,140 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
      2021-02-22 08:13:16,149 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
      2021-02-22 08:13:16,158 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
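
      One way to avoid the repeated lookups is to memoize table details for the duration of a single Initiator iteration. The following is a minimal sketch of that idea, not the actual Initiator code or the fix that was committed: the class TableLookupCache, its methods, and the loader function are hypothetical names introduced for illustration, and the loader stands in for whatever call resolves a table from HMS.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical per-iteration cache of table details, keyed by "db.table".
 * Illustrative only; this is not part of the Hive code base.
 */
final class TableLookupCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    /** @param loader assumed to perform the expensive metastore lookup for a "db.table" name */
    TableLookupCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns cached details, invoking the loader only on the first request for a table. */
    T get(String dbName, String tableName) {
        return cache.computeIfAbsent(dbName + "." + tableName, loader);
    }

    /** Drops all entries, e.g. at the start of the next iteration so stale metadata is not reused. */
    void clear() {
        cache.clear();
    }
}
```

      With such a cache, the five partitions of store_returns_tmp2 shown in the log above would trigger one table lookup instead of five. Clearing the cache between iterations is a design choice that trades a few extra lookups for not serving outdated table metadata.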
      

            People

              asinkovits Antal Sinkovits
              rajesh.balamohan Rajesh Balamohan

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4.5h