Description
If a peer is partitioned from the current leader, its election timer will fire and it will trigger an election. However, this doesn't mean that the leader is unhealthy: it may well be working and in touch with a majority of peers. It just means that the connection between this peer and the leader is down.
In this case we might fall into a cycle where the peer triggers an election and increases the term of some of the other peers, until eventually they reject messages from the current leader. The peer's election will likely be lost, since the leader has a more up-to-date log, so it's possible that the leader gets elected again, in which case this process repeats.
If the leader is being hammered enough, meaning it has a more up-to-date log than the majority, this can go on forever.
We should make a peer ask a majority whether they currently have a leader before triggering a leader election.
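The proposed check could look roughly like the sketch below. It is only an illustration in Python, not Kudu code: the `Peer` class, its fields, and `should_start_election` are hypothetical names, and it assumes each voter can report when it last heard from the leader. The would-be candidate counts itself as having no leader and only proceeds if a majority of all voters agree.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical view of one other voter in the Raft configuration."""
    name: str
    last_leader_contact: float  # time of the last heartbeat from the leader
    reachable: bool = True      # whether the would-be candidate can reach it

    def has_live_leader(self, now: float, election_timeout: float) -> bool:
        # A voter considers the leader alive if it heard from it recently.
        return now - self.last_leader_contact < election_timeout


def should_start_election(others, now, election_timeout):
    """Return True only if a majority of voters report no live leader."""
    total_voters = len(others) + 1          # the other voters plus ourselves
    majority = total_voters // 2 + 1
    # Our own election timer fired, so we count ourselves as "no leader".
    no_leader_votes = 1
    for peer in others:
        if not peer.reachable:
            continue  # no reply, so no confirmation either way
        if not peer.has_live_leader(now, election_timeout):
            no_leader_votes += 1
    return no_leader_votes >= majority
```

For example, in a three-voter configuration where only the link between the candidate and the leader is down, the remaining follower still reports a live leader, so the candidate stays quiet and nobody's term is bumped:

```python
others = [
    Peer("leader", last_leader_contact=100.0, reachable=False),
    Peer("follower", last_leader_contact=99.5),
]
should_start_election(others, now=100.0, election_timeout=1.5)  # False
```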
Issue Links
- relates to KUDU-2947 A replica with slow WAL may grant votes even if established leader is alive and well (Resolved)