  1. Jackrabbit Oak
  2. OAK-3362

Estimate compaction based on diff to previous compacted head state

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: segment-tar

    Description

      Food for thought: try to base the compaction estimation on a diff between the latest compacted state and the current state.

      Pros

      • estimation duration would be proportional to the number of changes on the current head state
      • using the size on disk as a reference, we could stop the estimation early as soon as we go over the GC threshold
      • data collected during this diff could in theory be passed as input to the compactor, so it could focus on compacting a specific subtree
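
To make the early-stop idea concrete, here is a minimal sketch in plain Java. It is not Oak code: `DiffBasedEstimator`, `Change`, `Estimate`, and the per-change `supersededBytes` figure are hypothetical names invented for illustration, and the assumption that a diff can report how many bytes each change superseded is mine, not something the segment store is known to provide.

```java
import java.util.List;

/** Hypothetical sketch: estimate garbage from the changes made since the last compacted state. */
final class DiffBasedEstimator {

    /** One changed node reported by the diff (illustrative; not an Oak type). */
    record Change(String path, long supersededBytes) {}

    /** Outcome of the estimation run. */
    record Estimate(boolean compactionNeeded, long garbageBytes, int changesVisited) {}

    /**
     * Sums the superseded bytes change by change and returns as soon as the
     * running total exceeds gcThresholdPercent of the size on disk.
     */
    static Estimate estimate(Iterable<Change> changes, long sizeOnDisk, int gcThresholdPercent) {
        double limit = sizeOnDisk * (gcThresholdPercent / 100.0);
        long garbage = 0;
        int visited = 0;
        for (Change change : changes) {
            visited++;
            garbage += change.supersededBytes();
            if (garbage > limit) {
                // Threshold crossed: the remaining changes do not need to be inspected.
                return new Estimate(true, garbage, visited);
            }
        }
        return new Estimate(false, garbage, visited);
    }

    public static void main(String[] args) {
        List<Change> changes = List.of(
                new Change("/content/a", 40),
                new Change("/content/b", 70),
                new Change("/content/c", 500));
        // 10% of 1000 bytes gives a limit of 100 bytes, crossed after the second change.
        System.out.println(estimate(changes, 1_000, 10));
    }
}
```

The cost of a run is bounded by the number of changes visited rather than by the size of the repository, and the paths seen before the early exit could be collected as a hint for the compactor.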

      Cons

      • need to keep a reference to the previous compacted state. After startup and before the first compaction this might prove difficult (except maybe if we persist only the revision, similar to what the async indexer currently does)
      • coming up with a threshold for running compaction might prove difficult
      • diff might be costly, but still cheaper than the current full diff
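
The "persist only the revision" idea from the first con could look roughly like the sketch below. It assumes the last compacted head can be identified by a string id and that a small side file is an acceptable place to keep it; `CompactedRevisionStore` and the file name `compacted.rev` are hypothetical, and this is not how the async indexer stores its checkpoint.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical sketch: remember the id of the last compacted head across restarts. */
final class CompactedRevisionStore {

    private final Path file;

    CompactedRevisionStore(Path file) {
        this.file = file;
    }

    /** Records the revision id, to be called after a successful compaction. */
    void save(String revisionId) throws IOException {
        Files.writeString(file, revisionId, StandardCharsets.UTF_8);
    }

    /** Returns the stored id, or empty when no compaction has been recorded yet. */
    Optional<String> load() throws IOException {
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        String id = Files.readString(file, StandardCharsets.UTF_8).trim();
        return id.isEmpty() ? Optional.empty() : Optional.of(id);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("oak-sketch");
        CompactedRevisionStore store = new CompactedRevisionStore(dir.resolve("compacted.rev"));
        System.out.println(store.load().isPresent()); // nothing recorded before the first compaction
        store.save("rev-123");
        System.out.println(store.load().orElse("none"));
    }
}
```

When `load()` returns empty (fresh startup, no compaction yet), the estimator would have to fall back to a full estimation, which is the difficult case the con describes.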

            People

              Assignee: Alex Deparvu (stillalex)
              Reporter: Alex Deparvu (stillalex)
              Votes: 0
              Watchers: 3
