  1. ZooKeeper
  2. ZOOKEEPER-3537

Leader election - Use of out of election messages

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.0
    • Component/s: None

      Description

      Hello ZooKeeper developers,

      In lookForLeader in FastLeaderElection, the following switch cases handle a received notification n whose n.state is either FOLLOWING or LEADING (https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L1029):

      case FOLLOWING:
      case LEADING:
        /*
         * Consider all notifications from the same epoch
         * together.
         */
        if (n.electionEpoch == logicalclock.get()) {
          recvset.put(n.sid, new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch));
          voteSet = getVoteTracker(recvset, new Vote(n.version, n.leader, n.zxid, n.electionEpoch, n.peerEpoch, n.state));
          if (voteSet.hasAllQuorums() && checkLeader(outofelection, n.leader, n.electionEpoch)) {
            setPeerState(n.leader, voteSet);
            Vote endVote = new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch);
            leaveInstance(endVote);
            return endVote;
          }
        }
      
        /*
         * Before joining an established ensemble, verify that
         * a majority are following the same leader.
         */
        outofelection.put(n.sid, new Vote(n.version, n.leader, n.zxid, n.electionEpoch, n.peerEpoch, n.state));
        voteSet = getVoteTracker(outofelection, new Vote(n.version, n.leader, n.zxid, n.electionEpoch, n.peerEpoch, n.state));
      
        if (voteSet.hasAllQuorums() && checkLeader(outofelection, n.leader, n.electionEpoch)) {
          synchronized (this) {
            logicalclock.set(n.electionEpoch);
            setPeerState(n.leader, voteSet);
          }
          Vote endVote = new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch);
          leaveInstance(endVote);
          return endVote;
        }
        break;

       

      Note that when n.electionEpoch == logicalclock.get(), the vote is added to recvset, but checkLeader is then called immediately with the votes in outofelection, as can be seen here (https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L1037).

      Checking outofelection instead of recvset does not cause any problems: if checkLeader fails on outofelection where it would have succeeded on recvset, it succeeds immediately afterwards, once the vote has been added to outofelection.
      Still, it seems more natural to check for a leader in recvset rather than in outofelection.

      Cheers,
      Karolos
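
      The observation in the description can be replayed with a small, simplified model. The sketch below is not the ZooKeeper implementation: SimpleVote, the three-argument checkLeader, and replay are hypothetical stand-ins, and the quorum check (hasAllQuorums) is left out. It assumes that checkLeader, for a peer that is not itself the proposed leader, needs an entry from that leader with state LEADING in whichever map it is given.

```java
import java.util.HashMap;
import java.util.Map;

public class OutOfElectionSketch {

    enum State { LOOKING, FOLLOWING, LEADING }

    // Hypothetical, stripped-down vote: only the fields this sketch needs.
    static final class SimpleVote {
        final long leader;
        final State state;

        SimpleVote(long leader, State state) {
            this.leader = leader;
            this.state = state;
        }
    }

    // Simplified stand-in for checkLeader (assumption): a peer that is not the
    // proposed leader accepts it only if the given map holds an entry sent by
    // that leader with state LEADING.
    static boolean checkLeader(Map<Long, SimpleVote> votes, long leader, long selfId) {
        if (leader == selfId) {
            return true;
        }
        SimpleVote fromLeader = votes.get(leader);
        return fromLeader != null && fromLeader.state == State.LEADING;
    }

    // Replays the order of operations from the excerpt for one notification
    // sent by the leader itself in the current election epoch.
    // Returns { check on outofelection before its put,
    //           same check on recvset,
    //           check on outofelection after its put }.
    static boolean[] replay() {
        long selfId = 1L;
        long leaderId = 3L;
        Map<Long, SimpleVote> recvset = new HashMap<>();
        Map<Long, SimpleVote> outofelection = new HashMap<>();

        // Assumption: the recvset entry carries the sender's state. The
        // excerpt builds that Vote without n.state.
        SimpleVote n = new SimpleVote(leaderId, State.LEADING);

        // First branch: the vote goes into recvset, the check reads outofelection.
        recvset.put(leaderId, n);
        boolean firstOnOutOfElection = checkLeader(outofelection, leaderId, selfId);
        boolean firstOnRecvset = checkLeader(recvset, leaderId, selfId);

        // Second block: the vote goes into outofelection, then the check repeats.
        outofelection.put(leaderId, n);
        boolean secondOnOutOfElection = checkLeader(outofelection, leaderId, selfId);

        return new boolean[] { firstOnOutOfElection, firstOnRecvset, secondOnOutOfElection };
    }

    public static void main(String[] args) {
        boolean[] r = replay();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

      Under these assumptions the first check against outofelection fails, the same check against recvset would have passed, and the repeated check passes once the vote is in outofelection. That matches the report: the outcome is the same, only the map consulted in the first branch is surprising.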

       

              People

              • Assignee:
                Karolos Antoniadis (karolos)
              • Reporter:
                Karolos Antoniadis (karolos)
              • Votes:
                0
              • Watchers:
                4


                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m