CASSANDRA-12966

Gossip thread slows down when using batch commit log

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 3.0.15, 3.11.1, 4.0-alpha1, 4.0
    • None
    • None

    Description

      When using batch commit log mode, the Gossip thread slows down while updating peers after a node bounces. This is because we perform a series of updates to the peers table via SystemKeyspace.updatePeerInfo, which is a synchronized method. How long each of those individual updates takes depends on how busy the system is at the time with respect to write traffic. If the system is largely quiescent, each update is relatively quick (it just waits for the fsync). If the system is receiving a lot of writes, then, depending on commitlog_sync_batch_window_in_ms, each of the Gossip thread's updates can get stuck in the backlog, which causes the Gossip thread to stop processing. We have observed in large clusters that a rolling restart triggers and exacerbates this behavior.
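
      The effect described above can be illustrated with a small toy model. This is only a sketch in Python, not Cassandra's Java code: `PeersTable`, `update_peer_info`, `process_gossip_round`, the column names, and the wait times are all hypothetical stand-ins. A lock plays the role of the synchronized method, and a sleep stands in for waiting on the batch commit log sync. Because every update runs on the single calling thread and blocks until the simulated sync completes, the total stall grows with the number of updates multiplied by the per-update wait.

```python
import threading
import time


class PeersTable:
    """Toy stand-in for the peers table; not Cassandra's actual implementation."""

    def __init__(self, sync_wait_s):
        self._lock = threading.Lock()    # plays the role of Java's `synchronized`
        self._sync_wait_s = sync_wait_s  # assumed wait until the commit log sync completes
        self.rows = {}

    def update_peer_info(self, peer, column, value):
        with self._lock:
            self.rows.setdefault(peer, {})[column] = value
            # In batch mode the write returns only after the sync; simulate that wait.
            time.sleep(self._sync_wait_s)


def process_gossip_round(table, peers, columns):
    """Apply every peer update on the calling thread and return elapsed seconds."""
    start = time.monotonic()
    for peer in peers:
        for column in columns:
            table.update_peer_info(peer, column, "v")
    return time.monotonic() - start


if __name__ == "__main__":
    peers = [f"10.0.0.{i}" for i in range(1, 6)]
    columns = ["host_id", "rack", "tokens"]  # illustrative column names
    # Assumed waits: a short one for a quiet system, a longer one for a write backlog.
    for label, wait in [("quiescent", 0.001), ("busy", 0.02)]:
        elapsed = process_gossip_round(PeersTable(wait), peers, columns)
        print(f"{label}: {len(peers) * len(columns)} updates took {elapsed:.3f}s")
```

      In this model the only thing that changes between the two runs is the per-update wait, yet the calling thread is unavailable for the whole sum of waits, which mirrors why the Gossip thread stops processing under write load.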

          People

            jasobrown Jason Brown
            Stefan Podkowinski
            Votes: 0
            Watchers: 5
