Hive / HIVE-22705

LLAP cache is polluted by query-based compactor

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: None
    • Labels: None

    Description

      One step of query-based compaction is verifying the ACID sort order with the validate_acid_sort_order UDF. This is a prerequisite for the actual compaction, and it is performed by a query that reads the entire table.

      As a result, the entire table is loaded into the LLAP cache. This content is not useful and only pollutes the cache: cache entries are bound to files (file IDs), and compaction replaces those files, so the cached data can never be used again.

      I propose that we disable LLAP caching in the session that runs query-based compaction's queries.
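
      The proposal can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the session configuration is modelled as a plain map, the class and method names are hypothetical, and it assumes that setting the hive.llap.io.enabled property to false is an acceptable way to keep compaction reads out of the LLAP cache. The actual fix may use a different switch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the HIVE-22705 patch.
class CompactionSessionConf {
    // Assumption: this property is the switch that turns LLAP IO (and its cache) on or off.
    static final String LLAP_IO_ENABLED = "hive.llap.io.enabled";

    // Returns a copy of the base configuration with LLAP IO disabled,
    // so the shared configuration used by regular queries is left untouched.
    static Map<String, String> withLlapCacheDisabled(Map<String, String> base) {
        Map<String, String> copy = new HashMap<>(base);
        copy.put(LLAP_IO_ENABLED, "false");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        shared.put(LLAP_IO_ENABLED, "true");

        // Configuration intended only for the compactor's own queries.
        Map<String, String> compaction = withLlapCacheDisabled(shared);

        System.out.println("shared:     " + shared.get(LLAP_IO_ENABLED));
        System.out.println("compaction: " + compaction.get(LLAP_IO_ENABLED));
    }
}
```

      Copying the configuration rather than mutating it keeps the change scoped to the compaction session, so caching for ordinary user queries is unaffected.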

      Attachments

        1. HIVE-22705.2.patch
          6 kB
          Ádám Szita
        2. HIVE-22705.1.patch
          6 kB
          Ádám Szita
        3. HIVE-22705.0.patch
          6 kB
          Ádám Szita

          People

            Assignee: Ádám Szita
            Reporter: Ádám Szita
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: