Full_Name: Ralf Haferkamp
Version: 2.4.25/HEAD
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (89.166.172.234)
Submitted by: ralf

Under certain circumstances slapo-pcache causes a deadlock in slapd when refreshing a cached query. The reason seems to be that slapo-pcache considers the refresh search, issued from the consistency checker task, to be answerable from the local cache database (this only seems to happen when the pcacheAttrset contains the "objectclass" attribute). slapo-pcache then searches the cache db with the response callback pointing to be_modify() (indirectly through refresh_merge()) of the same database. When using bdb/hdb as the cache database, this causes the task to request a write lock on the cached entry within one bdb transaction while it already holds a read lock (from the internal search) through another TXN. slapd is deadlocked after that. See below for a gdb backtrace taken from that situation.
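
To make the lock conflict easier to see, here is a toy model of it in Python. It is not slapd or Berkeley DB code: the LockTable class, the locker names and the entry name are all made up for illustration, and the model uses non-blocking "try" calls so that it reports the conflict instead of hanging. It only assumes what is described above, namely that locks belong to a locker (one per TXN) rather than to a thread.

```python
from collections import defaultdict


class LockTable:
    """Toy lock manager: locks are owned by locker ids, not by threads."""

    def __init__(self):
        self.readers = defaultdict(set)  # object -> lockers holding a read lock
        self.writer = {}                 # object -> locker holding the write lock

    def try_read(self, locker, obj):
        """Grant a read lock unless another locker holds the write lock."""
        owner = self.writer.get(obj)
        if owner is not None and owner != locker:
            return False
        self.readers[obj].add(locker)
        return True

    def try_write(self, locker, obj):
        """Grant a write lock only if no *other* locker holds any lock."""
        owner = self.writer.get(obj)
        if owner is not None and owner != locker:
            return False
        if self.readers[obj] - {locker}:
            return False
        self.writer[obj] = locker
        return True


def refresh_would_block():
    """Replay the sequence from the report with two lockers on one entry."""
    table = LockTable()
    entry = "cached-entry"  # hypothetical name for the cached entry

    # The internal search reads the entry under its own locker.
    assert table.try_read("search-locker", entry)

    # The response callback runs before the search releases its read lock
    # and asks for a write lock under a different locker (the modify TXN).
    granted = table.try_write("modify-locker", entry)
    return not granted
```

In the real server the write request blocks instead of returning False. Since the same task drives both lockers, the read lock is never released, which matches the pthread_cond_wait() at the top of the backtrace.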
The fix seems to be fairly easy: just do not look up the cache db when refreshing a cached query. It doesn't make sense anyway. I'll commit something for that shortly.
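
A rough sketch of that idea, again as a Python toy and not the actual pcache.c change: SearchOp, its is_refresh marker and choose_target() are hypothetical names, and matching a query against the cache is reduced to a set lookup on the filter string.

```python
from dataclasses import dataclass


@dataclass
class SearchOp:
    filter_str: str
    is_refresh: bool = False  # hypothetical marker set by the refresh task


def choose_target(op, cached_filters):
    """Return "cache" or "remote" for a search operation.

    A refresh search always goes to the remote server, even if the
    query would be answerable from the local cache database.
    """
    if op.is_refresh:
        return "remote"
    return "cache" if op.filter_str in cached_filters else "remote"
```

With this check the refresh search never reads the cache db, so the callback that writes to it cannot collide with a read lock held by the same task.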

#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f2b88eaf519 in __db_pthread_mutex_lock (env=0xa5ebb0, mutex=<value optimized out>) at ../mutex/mut_pthread.c:318
#2 0x00007f2b88f4252e in __lock_get_internal (lt=0xa5efc0, sh_locker=<value optimized out>, flags=0, obj=0x14060, lock_mode=<value optimized out>, timeout=2169475480, lock=0x7f2b814f93a8) at ../lock/lock.c:953
#3 0x00007f2b88f4350c in __lock_vec (env=0xa5ebb0, sh_locker=0x7f2b858acf38, flags=0, list=0x7f2b814f9360, nlist=2, elistp=0x0) at ../lock/lock.c:136
#4 0x00007f2b88f43c48 in __lock_vec_api (dbenv=<value optimized out>, lid=2147483672, flags=0, list=0x7f2b814f9360, nlist=2, elistp=0x0) at ../lock/lock.c:84
#5 __lock_vec_pp (dbenv=<value optimized out>, lid=2147483672, flags=0, list=0x7f2b814f9360, nlist=2, elistp=0x0) at ../lock/lock.c:66
#6 0x000000000052a5d3 in bdb_cache_entry_db_relock (bdb=0x9c7a20, txn=0xa66670, ei=0xd71720, rw=1, tryOnly=0, lock=0x7f2b814f94d0) at ../../../../servers/slapd/back-bdb/cache.c:198
#7 0x000000000052c1e7 in bdb_cache_modify (bdb=0x9c7a20, e=0x7f2b85445068, newAttrs=0x7f2b8267e258, txn=0xa66670, lock=0x7f2b814f94d0) at ../../../../servers/slapd/back-bdb/cache.c:1231
#8 0x00000000004f3f2d in bdb_modify (op=0x7f2b8167a430, rs=0x7f2b814f9820) at ../../../../servers/slapd/back-bdb/modify.c:711
#9 0x000000000059e458 in refresh_merge (op=0x7f2b8167a430, rs=0x7f2b8167a370) at ../../../../servers/slapd/overlays/pcache.c:3275
#10 0x000000000045c207 in slap_response_play (op=0x7f2b8167a430, rs=0x7f2b8167a370) at ../../../servers/slapd/result.c:505
#11 0x000000000045dd28 in slap_send_search_entry (op=0x7f2b8167a430, rs=0x7f2b8167a370) at ../../../servers/slapd/result.c:997
#12 0x00000000004fbfd0 in bdb_search (op=0x7f2b8167a430, rs=0x7f2b8167a370) at ../../../../servers/slapd/back-bdb/search.c:962
#13 0x000000000059d86e in pcache_op_search (op=0x7f2b8167a430, rs=0x7f2b8167a370) at ../../../../servers/slapd/overlays/pcache.c:3037
#14 0x00000000004d9ba5 in overlay_op_walk (op=0x7f2b8167a430, rs=0x7f2b8167a370, which=op_search, oi=0x9c7430, on=0x9c7610) at ../../../servers/slapd/backover.c:661
#15 0x00000000004d9e52 in over_op_func (op=0x7f2b8167a430, rs=0x7f2b8167a370, which=op_search) at ../../../servers/slapd/backover.c:723
#16 0x00000000004d9f3a in over_op_search (op=0x7f2b8167a430, rs=0x7f2b8167a370) at ../../../servers/slapd/backover.c:750
#17 0x000000000059e9bb in refresh_query (op=0x7f2b8167a430, query=0xa66580, on=0x9c7610) at ../../../../servers/slapd/overlays/pcache.c:3371
#18 0x000000000059f03d in consistency_check (ctx=0x7f2b8167ab60, arg=0xa61910) at ../../../../servers/slapd/overlays/pcache.c:3492