  1. Kafka
  2. KAFKA-4871

Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP changes

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.10.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      I had a Zookeeper cluster whose machines automatically obtain their hostnames, so that the hostnames remain constant over time. I deleted my 3 Zookeeper machines, and new machines came back online with the same hostnames and updated their CNAME records.

      Kafka then failed and couldn't reconnect to Zookeeper as it didn't try to resolve the IP of Zookeeper again. See log below:

      [2017-03-09 05:49:57,302] INFO Client will use GSSAPI as SASL mechanism. (org.apache.zookeeper.client.ZooKeeperSaslClient)
      [2017-03-09 05:49:57,302] INFO Opening socket connection to server zookeeper-3.example.com/10.12.79.43:2181. Will attempt to SASL-authenticate using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)

      [ec2-user]$ dig +short zookeeper-3.example.com
      10.12.79.36

      As you can see, even though the machine itself resolves the hostname to the new IP (10.12.79.36), Kafka kept connecting to the old one (10.12.79.43). Kafka somehow didn't respect the TTL (which was set to 60 seconds) and never picked up the new IP. I feel that on a failed Zookeeper connection, Kafka should at least try to resolve the Zookeeper hostname again. That would allow Kafka to keep up with Zookeeper changes over time.

      What do you think? Is that expected behaviour or a bug?
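
      To make the suggestion concrete, here is a minimal sketch of re-resolving the hostname on every connection attempt. It is not how Kafka or the Zookeeper client is actually implemented; `ReResolvingHost` is a hypothetical helper name, and returning an empty list on a failed lookup is my own assumption about what a retry loop would want. It only uses standard `java.net` calls. Note that the JVM keeps its own DNS cache (controlled by the `networkaddress.cache.ttl` security property), so a fresh call may still return a cached answer for a while.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps the hostname and looks it up on each call,
// instead of storing an already-resolved address that pins the old IP.
final class ReResolvingHost {
    private final String host;
    private final int port;

    ReResolvingHost(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Asks the JVM resolver for the current addresses of the host.
    // Returns an empty list if the lookup fails (an assumption made here
    // so that a caller can simply retry later).
    List<InetSocketAddress> resolve() {
        List<InetSocketAddress> result = new ArrayList<>();
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                result.add(new InetSocketAddress(addr, port));
            }
        } catch (UnknownHostException e) {
            // Lookup failed: leave the list empty.
        }
        return result;
    }

    public static void main(String[] args) {
        ReResolvingHost zk = new ReResolvingHost("localhost", 2181);
        // A reconnect loop would call resolve() before each attempt.
        for (InetSocketAddress addr : zk.resolve()) {
            System.out.println("would try " + addr);
        }
    }
}
```

      A reconnect loop would call `resolve()` before each attempt rather than once at startup, so a changed record is picked up once the cached entry expires.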

            People

              Assignee: Unassigned
              Reporter: Stephane Maarek (stephane.maarek@gmail.com)
              Votes: 1
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: