CASSANDRA-18773

Compactions are slow

Details

    Description

      I have noticed that compactions involving a large number of sstables (for example, major compactions) are very slow. I have attached a cassandra-stress profile that can generate such a dataset under ccm. In my local test I have 2567 sstables of 4 MB each.

      I added code to track the wall clock time of various parts of the code. One problematic part is the ManyToOne constructor. Tracing through the code, a ManyToOne is created for every partition, over all of the sstable iterators. In my local test I get a measly 60 KB/s read speed, bottlenecked on a single CPU core (since this code is single threaded), with 85% of the wall clock time spent in the ManyToOne constructor.
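
      The kind of wall clock tracking described above could look roughly like the sketch below. This is not the instrumentation actually used for these measurements; SectionTimer and its methods are hypothetical names made up for illustration, and the section labels are whatever the caller chooses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates wall clock nanoseconds per labelled code section. */
final class SectionTimer {
    private final Map<String, Long> nanosBySection = new LinkedHashMap<>();

    /** Runs body, adds its elapsed wall clock time to the given section, and returns its result. */
    <T> T time(String section, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            nanosBySection.merge(section, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a section, or 0 if it was never timed. */
    long totalNanos(String section) {
        return nanosBySection.getOrDefault(section, 0L);
    }

    /** Percentage of all recorded time that was spent in the given section. */
    double share(String section) {
        long total = nanosBySection.values().stream().mapToLong(Long::longValue).sum();
        return total == 0 ? 0.0 : 100.0 * totalNanos(section) / total;
    }
}
```

      Wrapping the suspect call, for example timer.time("ManyToOne ctor", () -> buildMerge(inputs)) where buildMerge is a placeholder, and comparing share(...) across sections is one way to arrive at a figure like the 85% above. Since the code path is single threaded, a plain map is enough here.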

      As another data point showing that the merge iterator is the slow part: cfstats from https://github.com/instaclustr/cassandra-sstable-tools/, which reads all the sstables but does no merging, gets a 26 MB/s read speed.

      Tracing back from the ManyToOne call, I see this in UnfilteredPartitionIterators::merge:

                      for (int i = 0; i < toMerge.size(); i++)
                      {
                          if (toMerge.get(i) == null)
                          {
                              if (null == empty)
                                  empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
                              toMerge.set(i, empty);
                          }
                      }
       

      I am not sure what the purpose of creating these empty iterators is. But on a whim I removed all of them before passing the list to ManyToOne; the wall clock time then shifted to CompactionIterator::hasNext() and read speed increased to 1.5 MB/s.
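
      A minimal, self-contained sketch of that experiment follows. It is not the attached patch: java.util.Iterator stands in for Cassandra's row iterators, and DropAbsentInputs and presentOnly are hypothetical names. Whether any caller relies on the positions of the entries in toMerge is not established here, so this only illustrates the filtering step.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the experiment: skip absent inputs instead of padding them. */
final class DropAbsentInputs {
    /**
     * Returns only the non-null iterators from toMerge, in their original order.
     * The input list is left unmodified; positions in the result no longer
     * correspond to positions in toMerge.
     */
    static <T> List<Iterator<T>> presentOnly(List<Iterator<T>> toMerge) {
        List<Iterator<T>> present = new ArrayList<>();
        for (Iterator<T> it : toMerge) {
            if (it != null) {
                present.add(it);
            }
        }
        return present;
    }
}
```

      With 2567 sstables and a partition that lives in only a handful of them, the merge would then be built over those few inputs rather than over 2567 entries that are mostly empty placeholders.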

      So it seems there are further bottlenecks in this code path, but the first is this ManyToOne and having to build it for every partition read.

      Attachments

        1. 18773.patch
          5 kB
          Cameron Zemek
        2. compact-poc.patch
          13 kB
          Cameron Zemek
        3. flamegraph.png
          163 kB
          Cameron Zemek
        4. stress.yaml
          0.6 kB
          Cameron Zemek

          People

            cam1982 Cameron Zemek
            cam1982 Cameron Zemek
            Cameron Zemek
            Branimir Lambov, Stefan Miklosovic
            Votes: 1
            Watchers: 8

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 40m