HIVE-22284: Improve LLAP CacheContentsTracker to collect and display correct statistics


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: llap
    • Labels: None

    Description

      When keeping track of which buffers correspond to what Hive objects, CacheContentsTracker relies on cache tags.

      Currently a tag is a plain String that ideally holds the DB name, the table name, and a partition spec, joined by "." and "/". This information is derived from the Path of the file being cached, so the resulting tag is sometimes wrong, especially for external tables.
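
      To illustrate why a path-derived tag can be wrong, here is a minimal sketch. It is not the CacheContentsTracker code: the class PathTagSketch, the method tagFromPath, and the assumed warehouse layout (".../<db>.db/<table>/<key>=<value>/.../<file>") are hypothetical and serve only to show the failure mode.

```java
import java.util.ArrayList;
import java.util.List;

public class PathTagSketch {

    /**
     * Hypothetical helper: guesses a "db.table/key=value/..." tag from a file path.
     * Assumes the layout ".../<db>.db/<table>/<key>=<value>/.../<file>".
     */
    static String tagFromPath(String filePath) {
        String[] parts = filePath.split("/");
        List<String> partitionDirs = new ArrayList<>();
        int i = parts.length - 2; // start at the directory that holds the file

        // Walk upwards, collecting directories that look like "key=value".
        while (i >= 0 && parts[i].contains("=")) {
            partitionDirs.add(0, parts[i]);
            i--;
        }

        // The next two levels up are assumed to be the table and the database.
        String table = i >= 0 ? parts[i] : "unknown";
        String dbDir = i >= 1 ? parts[i - 1] : "unknown";
        String db = dbDir.endsWith(".db")
                ? dbDir.substring(0, dbDir.length() - ".db".length())
                : dbDir;

        StringBuilder tag = new StringBuilder(db).append('.').append(table);
        for (String dir : partitionDirs) {
            tag.append('/').append(dir);
        }
        return tag.toString();
    }

    public static void main(String[] args) {
        // Layout matches the assumption, so the guess is plausible.
        System.out.println(tagFromPath("/warehouse/sales.db/orders/year=2019/000000_0"));
        // An external location follows no such layout, so the guess is meaningless.
        System.out.println(tagFromPath("/data/landing/2019/000000_0"));
    }
}
```

      Under these assumptions the first path yields "sales.orders/year=2019", while the second yields "landing.2019", which names neither a real database nor a real table.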

      There is also a bug in calculating aggregated stats for a 'parent' tag (the one corresponding to the table that a partition belongs to): the overall maxCount and maxSize do not add up to the sum of the partitions' values. This happens when buffers are removed from the cache.
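
      One way such a mismatch can arise is sketched below. This is an assumption about the mechanism, not the actual tracker code: TagStatsSketch and its Stats class are hypothetical, and they simply record a running maximum independently for the table tag and for each partition tag.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TagStatsSketch {

    /** Hypothetical per-tag counters: current values plus their observed peaks. */
    static final class Stats {
        long count, maxCount, size, maxSize;

        void add(long bytes) {
            count++;
            size += bytes;
            maxCount = Math.max(maxCount, count);
            maxSize = Math.max(maxSize, size);
        }

        void remove(long bytes) {
            count--;
            size -= bytes;
        }
    }

    final Stats table = new Stats();                       // the 'parent' tag
    final Map<String, Stats> partitions = new LinkedHashMap<>();

    void cache(String partition, long bytes) {
        partitions.computeIfAbsent(partition, k -> new Stats()).add(bytes);
        table.add(bytes);
    }

    void evict(String partition, long bytes) {
        partitions.get(partition).remove(bytes);
        table.remove(bytes);
    }

    long sumOfPartitionMaxCounts() {
        return partitions.values().stream().mapToLong(s -> s.maxCount).sum();
    }

    long sumOfPartitionMaxSizes() {
        return partitions.values().stream().mapToLong(s -> s.maxSize).sum();
    }

    public static void main(String[] args) {
        TagStatsSketch t = new TagStatsSketch();
        t.cache("p=1", 100);
        t.evict("p=1", 100);   // buffer leaves the cache before the next one arrives
        t.cache("p=2", 100);

        // The table never held more than one buffer at a time, but each
        // partition recorded its own peak of one buffer.
        System.out.println("table maxCount = " + t.table.maxCount);
        System.out.println("sum of partition maxCounts = " + t.sumOfPartitionMaxCounts());
    }
}
```

      In this scenario the table-level maxCount is 1 and maxSize is 100, while the partition-level values sum to 2 and 200. The peaks only diverge once a buffer has been removed, which matches the condition described above.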

       

      Attachments

        1. HIVE-22284.0.patch
          57 kB
          Ádám Szita
        2. HIVE-22284.1.patch
          67 kB
          Ádám Szita
        3. HIVE-22284.2.patch
          67 kB
          Ádám Szita
        4. HIVE-22284.3.patch
          71 kB
          Ádám Szita
        5. HIVE-22284.4.patch
          71 kB
          Ádám Szita
        6. HIVE-22284.5.patch
          71 kB
          Ádám Szita
        7. HIVE-22284.6.patch
          71 kB
          Ádám Szita
        8. HIVE-22284.7.patch
          71 kB
          Ádám Szita


            People

              Assignee: Ádám Szita (szita)
              Reporter: Ádám Szita (szita)
              Votes: 0
              Watchers: 3
