  1. ZooKeeper
  2. ZOOKEEPER-4777

ZooKeeper becomes unresponsive when using native GSSAPI

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.13, 3.4.14, 3.5.6, 3.5.7, 3.6.2, 3.7.1, 3.6.4, 3.7.2, 3.8.2, 3.8.3
    • Fix Version/s: None
    • Component/s: kerberos, server
    • Labels: None
    • Environment: RHEL 7 and OpenJDK Runtime Environment (build 1.8.0_392-b08)

      RHEL 8 and OpenJDK Runtime Environment (Red_Hat-17.0.9.0.9-1) (build 17.0.9+9-LTS)

    Description

      The ZooKeeper ensemble starts up properly once quorum is formed. A leader is elected and starts serving requests. After a while the leader gets stuck: it keeps accepting requests but does not process them. The same happens on the participants: they accept requests, but because the leader is not processing them, the requests keep piling up.

      This causes a sudden increase in the number of CLOSE_WAIT connections on the ZooKeeper servers. When this happens, the ensemble is completely unresponsive, causing connection loss and timeouts. Once CLOSE_WAIT connections start to appear, the number of open connections on each server spikes from about 200 to as high as 100,000 within a few minutes.
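
      To track the CLOSE_WAIT growth described above, one option is to count state tokens in the output of netstat or ss. The following is a minimal sketch, not part of ZooKeeper: the class name CloseWaitCounter and the sample lines in the comments are made up for illustration. It assumes the input is plain text in which netstat spells the state CLOSE_WAIT and ss spells it CLOSE-WAIT.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not ZooKeeper code): counts CLOSE_WAIT sockets in netstat/ss text. */
final class CloseWaitCounter {

    /**
     * Counts lines that contain a whitespace-separated token equal to
     * CLOSE_WAIT (netstat spelling) or CLOSE-WAIT (ss spelling).
     * Made-up example lines:
     *   tcp 1 0 10.0.0.1:2181 10.0.0.2:51234 CLOSE_WAIT
     *   CLOSE-WAIT 1 0 10.0.0.1:2181 10.0.0.3:40000
     */
    static long countCloseWait(List<String> lines) {
        return lines.stream()
                .filter(line -> Arrays.stream(line.trim().split("\\s+"))
                        .anyMatch(tok -> tok.equals("CLOSE_WAIT") || tok.equals("CLOSE-WAIT")))
                .count();
    }

    public static void main(String[] args) {
        // Reads the command output from standard input, one socket per line.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        List<String> lines = in.lines().collect(Collectors.toList());
        System.out.println("CLOSE_WAIT sockets: " + countCloseWait(lines));
    }
}
```

      With JDK 11 or later this could be run as `ss -tan | java CloseWaitCounter.java`. Sampling it every few seconds would show how quickly the count climbs once the ensemble stops responding.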

      The thread dumps show a consistent pattern: the NIOServerCxnFactory selector thread is always blocked, waiting to acquire a lock in org.apache.zookeeper.server.ZooKeeperSaslServer.createSaslServer:

      tdump_zkdev14.i.ia55.net_1694037623.logs-"NIOServerCxnFactory.SelectorThread-0" #16 daemon prio=5 os_prio=0 cpu=9126323.70ms elapsed=25935.16s tid=0x00007f9118702320 nid=0x20ed94 waiting for monitor entry  [0x00007f907e635000]
      tdump_zkdev14.i.ia55.net_1694037623.logs:   java.lang.Thread.State: BLOCKED (on object monitor)
      tdump_zkdev14.i.ia55.net_1694037623.logs-	at org.apache.zookeeper.server.ZooKeeperSaslServer.createSaslServer(ZooKeeperSaslServer.java:42)
      tdump_zkdev14.i.ia55.net_1694037623.logs-	- waiting to lock <0x0000000700391098> (a org.apache.zookeeper.Login)
      tdump_zkdev14.i.ia55.net_1694037623.logs-	at org.apache.zookeeper.server.ZooKeeperSaslServer.<init>(ZooKeeperSaslServer.java:38) 
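
      The BLOCKED state in the dump above means the selector thread is waiting to enter a monitor that another thread holds. The following is a minimal sketch of that general mechanism only. It is not ZooKeeper code, and the report does not show which thread holds the Login lock or why. SHARED_LOGIN, createSaslServerLike and the thread names are hypothetical stand-ins.

```java
import java.util.concurrent.CountDownLatch;

/** Illustration only (not ZooKeeper code): one slow monitor holder blocks every other caller. */
final class SharedLockContentionSketch {

    // Hypothetical stand-in for the shared org.apache.zookeeper.Login instance.
    private static final Object SHARED_LOGIN = new Object();

    /** Hypothetical stand-in for createSaslServer: every caller must enter the same monitor. */
    static void createSaslServerLike() {
        synchronized (SHARED_LOGIN) {
            // The per-connection SASL setup would happen here.
        }
    }

    /** Lets one thread hold the monitor, starts a second caller, and returns the caller's state. */
    static Thread.State observeSelectorState() {
        CountDownLatch holderHasLock = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (SHARED_LOGIN) {
                holderHasLock.countDown();
                try {
                    Thread.sleep(5_000); // simulates a slow call made while holding the lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "slow-lock-holder");
        Thread selector = new Thread(SharedLockContentionSketch::createSaslServerLike, "selector-like");
        try {
            holder.start();
            holderHasLock.await();   // make sure the holder owns the monitor first
            selector.start();
            long deadline = System.nanoTime() + 1_000_000_000L;
            while (selector.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(10);    // poll until the caller is parked on the monitor
            }
            Thread.State observed = selector.getState();
            holder.interrupt();      // release the monitor so both threads can finish
            holder.join();
            selector.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("selector-like thread state: " + observeSelectorState());
    }
}
```

      In this sketch the second thread is expected to report BLOCKED for as long as the first one holds the monitor. If a selector thread is stuck this way, it cannot service its connections, which would be consistent with requests being accepted but never processed.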

      Seems to be related to https://issues.apache.org/jira/browse/ZOOKEEPER-2230

       

      Thanks

          People

            Assignee: Unassigned
            Reporter: Rickey Visinski (rickeyski)