Cassandra / CASSANDRA-10496

Make DTCS/TWCS split partitions based on time during compaction

Details

    Description

      To avoid getting old data in new time windows with DTCS (or related strategies such as TWCS), we need to split old data out into its own sstable during compaction.

      My initial idea is to create just two sstables: when we create the compaction task we state the start and end times for the window, and any data older than the window is put in its own sstable.

      By creating a single sstable with the old data, we incrementally get the windows correct. Say we have an sstable with these timestamps:

      [100, 99, 98, 97, 75, 50, 10]

      and we are compacting in window [100, 80]. We would create two sstables, [100, 99, 98, 97] and [75, 50, 10], and the first window is now 'correct'. The next compaction would compact in window [80, 60] and create sstables [75] and [50, 10], and so on.
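
      The incremental split described above can be sketched roughly as follows. This is only an illustration in Python, not Cassandra code: the function name split_by_window and its parameters are hypothetical, timestamps stand in for whole partitions or cells, and treating the older bound of the window as inclusive is an assumption the ticket does not settle.

```python
from typing import List, Tuple


def split_by_window(
    timestamps: List[int], window_newest: int, window_oldest: int
) -> Tuple[List[int], List[int]]:
    """Split timestamps into (data for the window, data older than the window).

    Hypothetical helper for illustration only. Assumptions:
    - window_oldest is inclusive;
    - anything newer than window_newest stays in the first output, so only
      the older bound decides the split.
    """
    if window_oldest > window_newest:
        raise ValueError("window_oldest must not be newer than window_newest")
    in_window = [t for t in timestamps if t >= window_oldest]
    older = [t for t in timestamps if t < window_oldest]
    return in_window, older


# The example from the description: compact in window [100, 80] first.
current, old = split_by_window([100, 99, 98, 97, 75, 50, 10], 100, 80)
print(current, old)

# The next compaction works on the old data in window [80, 60].
current, old = split_by_window(old, 80, 60)
print(current, old)
```

      Each pass writes at most two outputs, so one compaction fixes only the window it was created for; the remaining old data is sorted out by later compactions.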

      We will probably also want to base the windows on the newest data in the sstables, so that there actually is data older than the window.
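
      One possible reading of this, sketched in Python under stated assumptions: anchor the window at the newest timestamp found in the sstables being compacted and extend it back by a fixed size. The name window_from_newest and the window_size parameter are hypothetical; the ticket does not say how the window length is chosen.

```python
from typing import List, Tuple


def window_from_newest(
    sstable_max_timestamps: List[int], window_size: int
) -> Tuple[int, int]:
    """Return (window_newest, window_oldest) anchored at the newest data.

    Hypothetical helper: sstable_max_timestamps holds the newest timestamp
    of each sstable in the compaction; window_size is an assumed parameter.
    """
    if not sstable_max_timestamps:
        raise ValueError("need at least one sstable")
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    newest = max(sstable_max_timestamps)
    return newest, newest - window_size


# Two sstables whose newest timestamps are 100 and 75, with a window of 20.
print(window_from_newest([100, 75], 20))
```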

            People

              Assignee: Unassigned
              Reporter: Marcus Eriksson (marcuse)
              Votes: 9
              Watchers: 29

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m