Kartik Subbarao wrote:
On 06/01/2011 02:02 PM, Howard Chu wrote:
Kartik Subbarao wrote:
On 06/01/2011 09:08 AM, Kartik Subbarao wrote:
Update: It's not a locks issue.
Another update -- after I reduced the idletimeout to 60 seconds, the problem seems to have gone away. It would still be useful to know what might be causing this problem and to be able to support higher levels of idletimeout, but at least I have another workable option now.
As usual, when investigating a hang, you should attach to slapd with gdb and get a snapshot of what all the threads are doing. Working around it before you know what caused it isn't all that useful in the long run.
Attached is the gdb stack trace from the hang state. It looks like the threads are stuck in pthread_cond_wait() from send_ldap_ber(). Are there other relevant variables/structures to inspect for this scenario?
-Kartik
Also get a netstat -nA inet. Threads waiting in send_ldap_ber() means their output buffers filled up because the clients didn't read the pending data.
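To illustrate the condition described above, here is a minimal Python sketch of a writer whose peer never reads. It is not slapd code: the function name fill_send_buffer and its parameters are hypothetical, and it uses a local socket pair rather than a real TCP client connection. The writer is set non-blocking so the full buffer shows up as an error; a blocking writer would simply wait at that point, which is roughly what the stuck threads in the trace appear to be doing.

```python
import socket


def fill_send_buffer(chunk_size=4096, max_bytes=64 * 1024 * 1024):
    """Write to a peer that never reads.

    Returns (bytes_accepted, stalled), where stalled is True if the
    kernel refused further data because the buffers were full.
    """
    # Hypothetical stand-in for a server/client connection.
    server, client = socket.socketpair()
    try:
        # Non-blocking, so a full buffer raises instead of hanging.
        server.setblocking(False)
        chunk = b"x" * chunk_size
        sent = 0
        while sent < max_bytes:
            try:
                sent += server.send(chunk)
            except BlockingIOError:
                # The peer has not read anything, so nothing more fits.
                return sent, True
        return sent, False
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    sent, stalled = fill_send_buffer()
    print(f"accepted {sent} bytes before stalling: {stalled}")
```

On a real TCP connection the same situation would normally be visible in the netstat output as a non-zero Send-Q on the server side of the affected connections.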