KUDU-2704

Rowsets that are much bigger than the target size discourage compactions


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.9.0
    • Fix Version/s: 1.9.0
    • Component/s: None
    • Labels: None

    Description

      In KUDU-2701, I fixed a compaction loop related to KUDU-1400. The size used for compaction counted only the base data and redos, so some rowsets looked small when they weren't; compacting them was effectively a no-op, which resulted in a compaction loop. Now, rowset count / KUDU-1400 compactions use the whole rowset size. While testing something on a table with 279 columns, I noticed that almost all rowsets were being flushed at a size of 80-90MB and that, even though the tablet height was increasing rapidly and was already above 20, almost no compactions were happening. Looking into it, I found that when the total size of a rowset is far above the target size, we assign a large negative score to including that rowset in a compaction, since the score is proportional to 1 - size/target size. This problem has always existed; it just got worse because the size now includes more things.
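
      To make the size penalty concrete, here is a minimal sketch of the relationship described above. It is not Kudu's actual code: the function name `size_score` is hypothetical, the proportionality constant is taken to be 1, and the 32 MB target is an assumed value chosen only for illustration.

```python
def size_score(rowset_size_mb: float, target_size_mb: float) -> float:
    """Illustrative size term: 1 - size / target size.

    Hypothetical helper, not a Kudu API. The issue only states that the
    score is proportional to this quantity; a constant of 1 is assumed.
    """
    return 1.0 - rowset_size_mb / target_size_mb


# Assumed target size, used only for illustration.
TARGET_MB = 32.0

# A rowset at half the target size gets a positive term.
small = size_score(16.0, TARGET_MB)

# A rowset in the 80-90MB range reported in this issue gets a large
# negative term, so including it in a compaction is discouraged.
large = size_score(85.0, TARGET_MB)

print(f"16 MB rowset: {small}")
print(f"85 MB rowset: {large}")
```

      Under these assumptions, any rowset larger than the target contributes a negative term, and the penalty grows linearly with the rowset's size, which is consistent with almost no compactions being selected when nearly every rowset is flushed at 80-90MB.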


          People

            Assignee: William Berkeley (wdberkeley)
            Reporter: William Berkeley (wdberkeley)
            Votes: 0
            Watchers: 2
