  1. ZooKeeper
  2. ZOOKEEPER-2836

QuorumCnxManager.Listener Thread Better handling of SocketTimeoutException

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.4.6
    • Fix Version/s: 3.7.0
    • Component/s: leaderElection, quorum
    • Environment:
      Machine: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.78-1 x86_64 GNU/Linux
      Java Version: jdk64/jdk1.8.0_40
      zookeeper version: 3.4.6.2.3.2.0-2950

    Description

      The QuorumCnxManager Listener thread blocks in ServerSocket.accept(), but on our boxes accept() throws SocketTimeoutException after 49 days 17 hours. The current code retries 3 times and then logs "As I'm leaving the listener thread, I won't be able to participate in leader election any longer: $<hostname>/$<ip>:3888_". Once server nodes reach this state, any node that we restart or add fails to join the cluster and logs 'WARN QuorumPeer<myid=1>/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@383 - Cannot open channel to 3 at election address $<hostname>/$<ip>:3888'.

      Because no timeout is set on the ServerSocket, accept() should never time out. However, others have reported the same behavior and added explicit checks for SocketTimeoutException, for example https://issues.apache.org/jira/browse/KARAF-3325.

      I think ZooKeeper should handle SocketTimeoutException along similar lines.
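
One way to picture the proposed handling is an accept loop that treats SocketTimeoutException as harmless and keeps listening, while other I/O failures still count against the retry budget. The following is a minimal sketch under stated assumptions, not the actual QuorumCnxManager code: the class name `TolerantListener`, the method `acceptLoop`, and the constant `MAX_RETRIES` are hypothetical names chosen for illustration, and the value 3 only mirrors the retry count mentioned in the description.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.function.Consumer;

// Hypothetical illustration; not the real QuorumCnxManager.Listener.
class TolerantListener {
    // Assumed retry budget, mirroring the "3 times retry" in the report.
    static final int MAX_RETRIES = 3;

    /**
     * Accepts connections until the socket is closed or MAX_RETRIES
     * consecutive non-timeout failures occur.
     * Returns the number of connections accepted.
     */
    static int acceptLoop(ServerSocket serverSocket, Consumer<Socket> handler) {
        int accepted = 0;
        int consecutiveFailures = 0;
        while (!serverSocket.isClosed() && consecutiveFailures < MAX_RETRIES) {
            try {
                Socket client = serverSocket.accept();
                consecutiveFailures = 0; // a successful accept resets the budget
                accepted++;
                handler.accept(client);
            } catch (SocketTimeoutException e) {
                // Not expected when no SO_TIMEOUT is set, but it has been
                // observed in practice. Keep listening and do not count it
                // as a failure, so the listener thread never exits over it.
                System.err.println("accept() timed out unexpectedly, retrying: " + e);
            } catch (IOException e) {
                if (serverSocket.isClosed()) {
                    break; // deliberate shutdown, not an error
                }
                consecutiveFailures++;
            }
        }
        return accepted;
    }
}
```

The SocketTimeoutException clause must come before the IOException clause because it is a subclass of IOException. If spurious timeouts could arrive in a tight burst, a short pause before retrying would keep this loop from spinning; that is omitted here for brevity.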

          People

            Assignee: gaoshu
            Reporter: Amarjeet Singh (amarjeet_singh)
            Votes: 1
            Watchers: 8

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 0.5h