I've just tested with OpenLDAP 2.4.26, as you suggested, but the problem remains.
There is no load at all, and I use mirror mode replication. This time the freeze occurred while modifying a user's password. The modification is made from an ordinary web interface in a Java application. The Java application connects to OpenLDAP with a single dedicated account, but for a password change we use proxied authorization (PROXYAUTHZ in the log below) so that the modification is performed with the user's own rights. That way his pwdReset attribute is removed if it was previously set to TRUE.
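For reference, the proxied password change looks roughly like the sketch below. This is a minimal illustration and not our actual application code: the class and method names are hypothetical, and it assumes a JNDI LdapContext already bound as the dedicated account with its base at the root, so that full DNs can be used as names. The control OID and the "dn:" authzId form are the ones defined in RFC 4370.

```java
import java.nio.charset.StandardCharsets;
import javax.naming.NamingException;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.ModificationItem;
import javax.naming.ldap.BasicControl;
import javax.naming.ldap.Control;
import javax.naming.ldap.LdapContext;

// Hypothetical helper class, for illustration only.
class ProxyAuthzSketch {
    // OID of the Proxied Authorization control (RFC 4370).
    static final String PROXY_AUTHZ_OID = "2.16.840.1.113730.3.4.18";

    // Build a critical proxied-authorization control for the given user DN.
    // The control value is the authzId as raw UTF-8 bytes ("dn:" + DN).
    static Control proxyAuthzControl(String userDn) {
        byte[] authzId = ("dn:" + userDn).getBytes(StandardCharsets.UTF_8);
        return new BasicControl(PROXY_AUTHZ_OID, true, authzId);
    }

    // Replace userPassword on userDn, acting with that user's rights.
    // serviceCtx is assumed to be bound as the dedicated service account.
    static void changePassword(LdapContext serviceCtx, String userDn, String newPassword)
            throws NamingException {
        // newInstance() shares the connection but carries its own request controls.
        LdapContext asUser = serviceCtx.newInstance(new Control[] { proxyAuthzControl(userDn) });
        try {
            ModificationItem[] mods = {
                new ModificationItem(DirContext.REPLACE_ATTRIBUTE,
                        new BasicAttribute("userPassword", newPassword))
            };
            asUser.modifyAttributes(userDn, mods);
        } finally {
            asUser.close();
        }
    }
}
```

With this shape, a call such as changePassword(ctx, "employeeNumber=00296,ou=people,dc=company", newPassword) would correspond to the PROXYAUTHZ and MOD lines of operation 21 in the log.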
Below is the connection that brings the freeze. We never get an answer from operation 21:
Aug 4 13:43:28 rhvtq slapd-master1[23372]: conn=1201 fd=139 ACCEPT from IP=192.168.1.132:57374 (IP=192.168.1.132:1389)
Aug 4 13:43:28 rhvtq slapd-master1[23372]: conn=1201 op=0 BIND dn="cn=yellow pages,ou=special users,dc=company" method=128
Aug 4 13:43:28 rhvtq slapd-master1[23372]: conn=1201 op=0 BIND dn="cn=yellow pages,ou=special users,dc=company" mech=SIMPLE ssf=0
Aug 4 13:43:28 rhvtq slapd-master1[23372]: conn=1201 op=0 RESULT tag=97 err=0 text=
...
...
...
Aug 4 13:46:40 rhvtq slapd-master1[23372]: conn=1201 op=20 SEARCH RESULT tag=101 err=0 nentries=1 text=
Aug 4 13:46:40 rhvtq slapd-master1[23372]: conn=1201 op=21 PROXYAUTHZ dn="employeeNumber=00296,ou=people,dc=company"
Aug 4 13:46:40 rhvtq slapd-master1[23372]: conn=1201 op=21 MOD dn="employeeNumber=00296,ou=people,dc=company"
Aug 4 13:46:40 rhvtq slapd-master1[23372]: conn=1201 op=21 MOD attr=userPassword
And here is the slapd pstack output:
Thread 18 (Thread 0x42018940 (LWP 23373)):
#0 0x00000031a7ad4018 in epoll_wait () from /lib64/libc.so.6
#1 0x000000000041eb0b in ldap_pvt_sasl_mutex_dispose ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 17 (Thread 0x42819940 (LWP 23374)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 16 (Thread 0x4301a940 (LWP 23375)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 15 (Thread 0x4381b940 (LWP 23376)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 14 (Thread 0x4401c940 (LWP 23377)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x4481d940 (LWP 23381)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb09199e8b in ldap_pvt_thread_rmutex_lock ()
#2 0x00000000004d7fcf in ?? ()
#3 0x000000000048402a in ldap_pvt_sasl_mutex_dispose ()
#4 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#5 0x0000000000477c2f in ldap_pvt_sasl_mutex_dispose ()
#6 0x000000000047f2eb in ldap_pvt_sasl_mutex_dispose ()
#7 0x0000000000422713 in ldap_pvt_sasl_mutex_dispose ()
#8 0x00002afb0919af38 in ldap_int_thread_pool_wrapper ()
#9 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#10 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 12 (Thread 0x4501e940 (LWP 23382)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb09622dd2 in __db_pthread_mutex_lock ()
#2 0x00002afb09622426 in __db_tas_mutex_lock ()
#3 0x00002afb096b2b49 in __lock_get_internal ()
#4 0x00002afb096b32c2 in __lock_get_pp ()
#5 0x00000000004b2c28 in ?? ()
#6 0x00000000004b3ab9 in ?? ()
#7 0x00000000004955ca in ldap_pvt_sasl_mutex_dispose ()
#8 0x00000000004840a2 in ldap_pvt_sasl_mutex_dispose ()
#9 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#10 0x0000000000423a49 in ldap_pvt_sasl_mutex_dispose ()
#11 0x00000000004840a2 in ldap_pvt_sasl_mutex_dispose ()
#12 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#13 0x0000000000424255 in ldap_pvt_sasl_mutex_dispose ()
#14 0x00000000004217e5 in ldap_pvt_sasl_mutex_dispose ()
#15 0x0000000000421dbf in ldap_pvt_sasl_mutex_dispose ()
#16 0x00002afb0919af38 in ldap_int_thread_pool_wrapper ()
#17 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#18 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 11 (Thread 0x4581f940 (LWP 25216)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x46020940 (LWP 25217)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb09622dd2 in __db_pthread_mutex_lock ()
#2 0x00002afb09622426 in __db_tas_mutex_lock ()
#3 0x00002afb096b2b49 in __lock_get_internal ()
#4 0x00002afb096b32c2 in __lock_get_pp ()
#5 0x00000000004b2c28 in ?? ()
#6 0x00000000004b3ab9 in ?? ()
#7 0x00000000004b75e3 in ?? ()
#8 0x00000000004bb3ce in ?? ()
#9 0x0000000000483f5a in ldap_pvt_sasl_mutex_dispose ()
#10 0x0000000000484a67 in ldap_pvt_sasl_mutex_dispose ()
#11 0x00000000004e226b in ?? ()
#12 0x000000000048402a in ldap_pvt_sasl_mutex_dispose ()
#13 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#14 0x000000000043d21e in ldap_pvt_sasl_mutex_dispose ()
#15 0x00000000004840a2 in ldap_pvt_sasl_mutex_dispose ()
#16 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#17 0x000000000043dc1f in ldap_pvt_sasl_mutex_dispose ()
#18 0x00000000004217e5 in ldap_pvt_sasl_mutex_dispose ()
#19 0x0000000000421dbf in ldap_pvt_sasl_mutex_dispose ()
#20 0x00002afb0919af38 in ldap_int_thread_pool_wrapper ()
#21 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#22 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x46821940 (LWP 25218)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb09622dd2 in __db_pthread_mutex_lock ()
#2 0x00002afb09622426 in __db_tas_mutex_lock ()
#3 0x00002afb096b2b49 in __lock_get_internal ()
#4 0x00002afb096b34ea in __lock_vec ()
#5 0x00002afb096b412b in __lock_vec_pp ()
#6 0x00000000004b3264 in ?? ()
#7 0x00000000004b3650 in ?? ()
#8 0x000000000048fe13 in ldap_pvt_sasl_mutex_dispose ()
#9 0x00000000004840a2 in ldap_pvt_sasl_mutex_dispose ()
#10 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#11 0x0000000000438847 in ldap_pvt_sasl_mutex_dispose ()
#12 0x00000000004840a2 in ldap_pvt_sasl_mutex_dispose ()
#13 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#14 0x0000000000438fb2 in ldap_pvt_sasl_mutex_dispose ()
#15 0x00000000004217e5 in ldap_pvt_sasl_mutex_dispose ()
#16 0x0000000000421dbf in ldap_pvt_sasl_mutex_dispose ()
#17 0x00002afb0919af38 in ldap_int_thread_pool_wrapper ()
#18 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#19 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x47022940 (LWP 25219)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x47823940 (LWP 25220)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x48024940 (LWP 25221)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x48825940 (LWP 25222)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x49026940 (LWP 25223)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x49827940 (LWP 25224)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb0919af8d in ldap_int_thread_pool_wrapper ()
#2 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x4a028940 (LWP 25225)):
#0 0x00000031a860ab99 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00002afb09199e8b in ldap_pvt_thread_rmutex_lock ()
#2 0x00000000004d7fcf in ?? ()
#3 0x000000000048402a in ldap_pvt_sasl_mutex_dispose ()
#4 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#5 0x00000000004e1a08 in ?? ()
#6 0x0000000000430285 in ldap_pvt_sasl_mutex_dispose ()
#7 0x0000000000430cc8 in ldap_pvt_sasl_mutex_dispose ()
#8 0x0000000000431a49 in ldap_pvt_sasl_mutex_dispose ()
#9 0x000000000043ce10 in ldap_pvt_sasl_mutex_dispose ()
#10 0x000000000043d4b4 in ldap_pvt_sasl_mutex_dispose ()
#11 0x00000000004840a2 in ldap_pvt_sasl_mutex_dispose ()
#12 0x0000000000484607 in ldap_pvt_sasl_mutex_dispose ()
#13 0x000000000043dc1f in ldap_pvt_sasl_mutex_dispose ()
#14 0x00000000004217e5 in ldap_pvt_sasl_mutex_dispose ()
#15 0x0000000000421dbf in ldap_pvt_sasl_mutex_dispose ()
#16 0x00002afb0919af38 in ldap_int_thread_pool_wrapper ()
#17 0x00000031a86064a7 in start_thread () from /lib64/libpthread.so.0
#18 0x00000031a7ad3c2d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2afb0996fbd0 (LWP 23372)):
#0 0x00000031a86077e5 in pthread_join () from /lib64/libpthread.so.0
#1 0x000000000041c16d in ldap_pvt_sasl_mutex_dispose ()
#2 0x000000000040997b in ldap_pvt_sasl_mutex_dispose ()
#3 0x00000031a7a1d994 in __libc_start_main () from /lib64/libc.so.6
#4 0x0000000000408389 in ldap_pvt_sasl_mutex_dispose ()
#5 0x00007fffd4f51358 in ?? ()
#6 0x0000000000000000 in ?? ()
I've also opened a case with Red Hat. They suggested that I test under RHEL 6 with their OpenLDAP RPM package (2.4.23-15.el6), so I'll do that right now. If you need more details on the hung slapd process on RHEL 5.4, I can send them, since it is still frozen.
Quanah Gibson-Mount wrote:
--On Tuesday, August 02, 2011 6:28 PM +0200 Cyril GROSJEAN cgrosjean@janua.fr wrote:
Any idea of what's wrong ? Known bug ?
Can you try with 2.4.26? There was an issue with 2.4.23 that may be the culprit.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc.
Zimbra :: the leader in open source messaging and collaboration