ZOOKEEPER-2982: Re-try DNS hostname -> IP resolution

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.5.0, 3.5.1, 3.5.3
    • Fix Version/s: 3.5.4, 3.6.0
    • Component/s: server
    • Labels: None

    Description

      ZOOKEEPER-1506 fixed a DNS resolution issue in 3.4. Some portions of the fix haven't yet been ported to 3.5.

      To recap the outstanding problem in 3.5: if a ZK server is started before all peer addresses are resolvable, it may cache the failed lookup and never resolve the address afterwards. For example, deploying ZK 3.5 to Kubernetes using a StatefulSet plus a headless Service may fail because the DNS records are created lazily.

      2018-02-18 09:11:22,583 [myid:0] - WARN  [QuorumPeer[myid=0](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@95] - Exception when following the leader
      java.net.UnknownHostException: zk-2.zk.default.svc.cluster.local
              at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
              at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
              at java.net.Socket.connect(Socket.java:589)
              at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:227)
              at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:256)
              at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:76)
              at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
      

      In the above example, the address `zk-2.zk.default.svc.cluster.local` was not resolvable when the server started, but became resolvable shortly thereafter. The server should eventually succeed but doesn't.
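
The behavior asked for here (repeat the lookup on each connection attempt instead of holding on to the first failure) can be sketched roughly as below. This is an illustrative sketch only, not the patch attached to this issue: the class `ReResolve`, the `Resolver` hook, and the `maxAttempts` / `backoffMillis` parameters are all hypothetical names introduced for the example.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class ReResolve {

    /** Hypothetical lookup hook, so the DNS call can be stubbed out in tests. */
    public interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /**
     * Performs a fresh lookup on every attempt and only builds the socket
     * address once the lookup succeeds, so an early failure is never kept.
     */
    public static InetSocketAddress resolveWithRetry(
            String host, int port, Resolver resolver,
            int maxAttempts, long backoffMillis)
            throws UnknownHostException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // A new lookup each time; nothing from the previous attempt is reused.
                return new InetSocketAddress(resolver.resolve(host), port);
            } catch (UnknownHostException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis); // wait before the next lookup
                }
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }
}
```

In a real server the resolver would presumably be `InetAddress::getByName`, for example `ReResolve.resolveWithRetry("zk-2.zk.default.svc.cluster.local", port, InetAddress::getByName, 10, 1000)`, with `port` being whichever peer port applies. Note that the JVM also caches failed lookups on its own for a short time (the `networkaddress.cache.negative.ttl` security property, 10 seconds by default), so retries spaced more closely than that may keep seeing the cached failure.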

          People

            Assignee: Flavio Paiva Junqueira (fpj)
            Reporter: Eron Wright (eronwright)
            Votes: 0
            Watchers: 6
