Richard Silverman wrote:
However, this is somehow ineffective, because shortly thereafter all the worker threads block with the following stack:
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x0000000000568f9a in ldap_int_thread_cond_wait (cond=0x7fe190fb9030, mutex=0x7fe190fb8fe0) at thr_posix.c:277 #2 0x000000000056aeee in ldap_pvt_thread_cond_wait (cond=0x7fe190fb9030, mutex=0x7fe190fb8fe0) at thr_debug.c:954 #3 0x000000000043e1b7 in send_ldap_ber (op=0x7fe160129fc0, ber=0x7fe16ae7b460) at result.c:303 #4 0x0000000000441338 in slap_send_search_entry (op=0x7fe160129fc0, rs=0x7fe16affc920) at result.c:1429 #5 0x00000000004c5953 in bdb_search (op=0x7fe160129fc0, rs=0x7fe16affc920) at search.c:1014 #6 0x000000000042daae in fe_op_search (op=0x7fe160129fc0, rs=0x7fe16affc920) at search.c:402 #7 0x000000000042d2f2 in do_search (op=0x7fe160129fc0, rs=0x7fe16affc920) at search.c:247 #8 0x000000000042a4b7 in connection_operation (ctx=0x7fe16affca00, arg_v=0x7fe160129fc0) at connection.c:1150 #9 0x00000000005679f3 in ldap_int_thread_pool_wrapper (xpool=0x1008e90) at tpool.c:688 #10 0x000000000056a493 in ldap_debug_thread_wrapper (arg=0x7fe1740015c0) at thr_debug.c:770 #11 0x00000039e3407851 in start_thread (arg=0x7fe16affd700) at pthread_create.c:301 #12 0x00000039e2ce811d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(This is with thread debugging enabled.) They are blocked here, again in send_ldap_ber():
while ( conn->c_writers > 0 && conn->c_writing ) { ldap_pvt_thread_cond_wait( &conn->c_write1_cv, &conn->c_write1_mutex ); }
The problem is that they are *all* here; there is no one left to signal that condition variable, so slapd locks up forever.
Please post the entire stack trace for all threads. Since we have c_writers = 8, c_writing = 1, c_writewaiter = 1
then one of the threads ought to be blocked on the write2_cv (result.c:373). That implies that they're still waiting for epoll() to say that the socket is writable.