Hive / HIVE-23050

Partition pruning cache miss during compilation

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Query Planning
    • Labels: None

    Description

      create table pcr_t1 (key int, value string) partitioned by (ds string);
      
      insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src where key < 20 order by key;
      insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src where key < 20 order by key;
      insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src where key < 20 order by key;
      
      explain extended select key, value, ds from pcr_t1 where (ds='2000-04-08' and key=1) or (ds='2000-04-09' and key=2) order by key, value, ds;
      

      During query compilation, HivePartitionPruner fetches the list of partitions and caches it. Later, PCR (partition condition removal) tries to get the pruned partitions, but because of a cache miss the request goes to the metastore server, which retrieves the pruned partitions again using listPartitions.

      An improvement would be to reuse the list of partitions that is already cached when doing partition pruning, both for PCR and for pruning in general.
      (I am not sure why HivePartitionPruner isn't able to do the partition pruning in the first place.)
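
The proposed improvement can be illustrated with a minimal, self-contained sketch. It is not Hive code: `PartitionCacheSketch`, `fetchFromMetastore`, `prune`, and the `metastoreCalls` counter are hypothetical names invented for this illustration, and the hard-coded partition list stands in for a real metastore `listPartitions` call. The idea shown is only that a second pruning pass (such as PCR) filters the full partition list cached by the first pass instead of going back to the metastore.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names are actual Hive classes or methods.
public class PartitionCacheSketch {
    // Counts simulated metastore round trips so the effect of the cache is visible.
    static int metastoreCalls = 0;

    // Full partition list per table, filled on the first fetch.
    static final Map<String, List<String>> allPartitions = new HashMap<>();

    // Stand-in for a metastore listPartitions call (assumption: returns all partitions).
    static List<String> fetchFromMetastore(String table) {
        metastoreCalls++;
        return List.of("ds=2000-04-08", "ds=2000-04-09", "ds=2000-04-10");
    }

    // Prunes against the cached full list; only the first call per table hits the metastore.
    static List<String> prune(String table, Predicate<String> filter) {
        List<String> all =
            allPartitions.computeIfAbsent(table, PartitionCacheSketch::fetchFromMetastore);
        List<String> kept = new ArrayList<>();
        for (String partition : all) {
            if (filter.test(partition)) {
                kept.add(partition);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // First pass: the pruner loads (and caches) every partition of pcr_t1.
        prune("pcr_t1", p -> true);
        // Second pass (the PCR step in the report): filter the cached list locally.
        List<String> pruned = prune("pcr_t1",
            p -> p.equals("ds=2000-04-08") || p.equals("ds=2000-04-09"));
        System.out.println(pruned + ", metastore calls: " + metastoreCalls);
    }
}
```

In this sketch the cache is keyed by table name only, so any filter can be answered from the cached list. A cache keyed by table plus filter expression would miss whenever the second pass asks with a different expression, which is one possible (unconfirmed) explanation for the behavior described above.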

          People

            Assignee: Vineet Garg (vgarg)
            Reporter: Vineet Garg (vgarg)
            Votes: 0
            Watchers: 1