Hive / HIVE-20143

analyze doesn't mark partition column stats as accurate after truncate

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Statistics
    • Labels: None

    Description

      Discovered while looking at transactional (txn) stats, but the problem also applies to non-transactional tables. Truncate followed by analyze works fine for non-partitioned tables, but not for partitioned ones:

      set hive.stats.dbclass=fs;
      set hive.stats.fetch.column.stats=true;
      set hive.stats.autogather=true;
      set hive.stats.column.autogather=true;
      set hive.compute.query.using.stats=true;
      set hive.mapred.mode=nonstrict;
      set hive.explain.user=false;
      set hive.fetch.task.conversion=none;
      set hive.query.results.cache.enabled=false;
      
      create table stats_part1(key int,value string) partitioned by (p int);
      insert into table stats_part1 partition(p=101) values (1, "foo");
      insert into table stats_part1 partition(p=102) values (2, "bar");
      explain select count(key) from stats_part1; -- answered from stats
      
      truncate table stats_part1 partition(p=101);
      explain select count(key) from stats_part1; -- not answered from stats, which is expected after truncate
      
      analyze table stats_part1 partition(p) compute statistics for columns;
      explain select count(key) from stats_part1; -- still not answered from stats (the bug)
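
      To check which partition is blocking the stats-based plan, the partition parameters can be inspected after each step. This is only a sketch, and it assumes the accuracy flag in question is the COLUMN_STATS_ACCURATE entry that describe formatted prints under the partition parameters. No expected output is shown because it was not captured in this report.

      ```sql
      -- Truncated and re-analyzed partition: look for COLUMN_STATS_ACCURATE
      -- in the partition parameters section of the output.
      describe formatted stats_part1 partition(p=101);

      -- Untouched partition, for comparison.
      describe formatted stats_part1 partition(p=102);

      -- Column-level statistics for the key column of the truncated partition.
      describe formatted stats_part1 partition(p=101) key;
      ```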
      


            People

              Assignee: Unassigned
              Reporter: Sergey Shelukhin (sershe)