Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Cannot Reproduce
Affects Version/s: 1.2
Environment: Ubuntu 8.04, Erlang 14.b.4 64bit
Skill Level: Committers Level (Medium to Hard)
Description
When using pull replication from an HTTPS CouchDB source, the client socket is never closed; it stays in CLOSE_WAIT forever. This will eventually crash the whole CouchDB server, as it runs out of file descriptors.
This did not happen with CouchDB 1.1.
I experimented with changing the socket options for the replicator client, with no luck. The only change I saw was that when running with keepalive (which was the default), the server side (pull peer) also leaks a connection. Now I am running with socket_options = [
{keepalive, false},
{send_timeout, 10000},
{send_timeout_close, true}]
which changes nothing, except that only the client side is leaking connections.
To test this, you need the PID of CouchDB's beam process (ps aux | grep beam).
Then list all the open files of this PID with "lsof -p $PID".
At first you will see the pull connections in the ESTABLISHED state for a while (even when the replication itself has long finished). Then at some point they switch to CLOSE_WAIT. The client-side socket needs to be closed by the replicator to go away and release its resources (e.g. the file descriptor).
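To make the leak easier to watch over time, the lsof step above could be wrapped in a small helper that tallies sockets per TCP state. This is only a sketch: the function name count_tcp_states is made up for illustration, and it assumes lsof prints the state in parentheses as the last field of each TCP line, which may differ between lsof versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of CouchDB): reads `lsof` output on stdin
# and prints one "<count> <STATE>" line per TCP connection state.
# Assumption: TCP lines end with the state in parentheses, e.g. "(CLOSE_WAIT)".
count_tcp_states() {
    awk '/ TCP / && $NF ~ /^\([A-Z_0-9]+\)$/ {
             # Strip the surrounding parentheses from the last field.
             state = substr($NF, 2, length($NF) - 2)
             count[state]++
         }
         END { for (s in count) print count[s], s }' | sort -k2
}

# Usage: pass the PID of the beam process as the first argument.
if [ -n "${1:-}" ]; then
    lsof -p "$1" | count_tcp_states
fi
```

Running this repeatedly against the beam PID (for example once a minute) should show whether the CLOSE_WAIT count keeps growing after replications have finished.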