On Fri, May 01, 2009 at 03:00:26PM -0700, Howard Chu wrote:
jwm@horde.net wrote:
Full_Name: John Morrissey
Version: 2.4.16
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (2001:4978:194:0:21f:5bff:fee9:da92)
After a couple of days of uptime, slapd stopped responding to incoming connections (connections were accepted, but all LDAP operations blocked). All worker threads appear to be blocked on mutex acquisition in bdb_cache_lru_link(). One thread was consuming a lot of CPU.
Backtrace is below. I also have a ~1.7GB core if it's deemed useful; I'll keep it around for a week or two. This is with BDB 4.7.25+all three patches.
Interesting trace. It looks like all the active threads are waiting for the mutex, but apparently none of them owns it. Can you please provide the contents of the mutex? E.g.:

  thread 14
  frame 3
  print *mutex
(gdb) fra 3
#3  0xb7eec1cd in ldap_pvt_thread_mutex_lock (mutex=0x940a2cc) at /tmp/buildd/openldap-2.4.16/libraries/libldap_r/thr_posix.c:296
296         return ERRVAL( pthread_mutex_lock( mutex ) );
(gdb) print *mutex
$1 = {__data = {__lock = 2, __count = 0, __owner = 6372, __kind = 0, __nusers = 1, {__spins = 0, __list = {__next = 0x0}}},
  __size = "\002\000\000\000\000\000\000\000\344\030\000\000\000\000\000\000\001\000\000\000\000\000\000", __align = 2}
LWP 6372 is the thread trying to do BDB lock promotion.
john
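
For anyone cross-checking the gdb output above, here is a minimal sketch that decodes the raw __size bytes back into the named fields. It assumes the 32-bit glibc (NPTL) layout implied by the field order gdb printed (five 4-byte little-endian integers followed by a 4-byte union), and it assumes the owner bytes are \344\030, which is 6372 in little-endian. decode_mutex is a hypothetical helper name, and this layout is a glibc internal, not a public API, so treat it as illustrative only.

```python
import struct

# Raw mutex bytes as gdb prints them in __size (octal escapes), plus the
# trailing NUL that gdb leaves implicit for the 24-byte char array.
# Assumption: the owner bytes are \344\030, i.e. 6372 little-endian.
RAW = (b"\002\000\000\000\000\000\000\000\344\030\000\000"
       b"\000\000\000\000\001\000\000\000\000\000\000\000")

# Field order as shown in gdb's __data view (assumed 32-bit glibc layout).
FIELDS = ("lock", "count", "owner", "kind", "nusers")


def decode_mutex(raw):
    """Unpack the first five 32-bit little-endian fields of the mutex."""
    values = struct.unpack_from("<iIiiI", raw)
    return dict(zip(FIELDS, values))


if __name__ == "__main__":
    mutex = decode_mutex(RAW)
    for name in FIELDS:
        print(f"__{name} = {mutex[name]}")
```

The decoded owner is the LWP id to look for in "info threads". As far as I know, a __lock value of 2 in glibc's futex-based normal mutex means "locked, with waiters", which would be consistent with the other threads being parked on it.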