  1. Cassandra
  2. CASSANDRA-14577

gc_grace_seconds should include UJ (Up/Joining) status period


Details

    • Type: Improvement
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • None
    • Component/s: Legacy/Coordination
    • None
    • Environment: Contrail 3.0.3.3-22/Cassandra 2.1.13

    Description

      Partial network connectivity (e.g. an MTU mismatch that blackholes jumbo frames) can cause a node to get stuck in UJ status indefinitely (as reported by nodetool).  The node can stay in this state for an extended period of time.  Once the isolated node rejoins after the network is repaired, it can cause extensive data loss on the healthy nodes.

       

      If the node were completely isolated, gc_grace_seconds would prevent the node from joining after the specified period.  Other corner cases besides "DN" should be covered if applicable.
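As a rough illustration of the requested behaviour, the sketch below counts time spent in UJ together with time spent in DN when deciding whether a node is still within gc_grace_seconds. It is not Cassandra code: `StatusInterval`, `UNSAFE_STATUSES`, `seconds_out_of_sync` and `safe_to_rejoin` are hypothetical names, and the status history is assumed to be available as a simple list.

```python
from dataclasses import dataclass

# Assumption: statuses during which a node may have missed deletes.
# "DN" and "UJ" are the nodetool status codes mentioned in this issue.
UNSAFE_STATUSES = {"DN", "UJ"}


@dataclass
class StatusInterval:
    status: str   # nodetool-style status code, e.g. "UN", "DN", "UJ"
    seconds: int  # how long the node stayed in that status


def seconds_out_of_sync(history):
    """Sum the trailing run of unsafe statuses (most recent intervals last)."""
    total = 0
    for interval in reversed(history):
        if interval.status not in UNSAFE_STATUSES:
            break  # the node was healthy here, so stop counting
        total += interval.seconds
    return total


def safe_to_rejoin(history, gc_grace_seconds):
    """True only if the combined DN + UJ time is below gc_grace_seconds."""
    return seconds_out_of_sync(history) < gc_grace_seconds


# Hypothetical history: healthy, then down, then stuck joining.
history = [
    StatusInterval("UN", 3600),
    StatusInterval("DN", 200_000),
    StatusInterval("UJ", 700_000),
]
# 864000 seconds (10 days) is the default gc_grace_seconds.
print(safe_to_rejoin(history, 864_000))
```

In this example the DN period alone is within the window, but DN plus UJ is not, which is the corner case this issue describes.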

       

      Reference: 

      JTAC 2018-0303-0029


          People

            Assignee: Unassigned
            Reporter: John Knost (jk3064)
            Votes: 0
            Watchers: 2
