toby@inf.ed.ac.uk wrote:
OK, this is my first crash since moving to 2.4.16:
It is slightly different from the previous ones, but it still comes through pcache.c:remove_from_template and, according to the slapd log, slapd was removing queries from the cache when the crash occurred:
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: TEMPLATE 0x94e70f0 QUERIES-- 5
Apr 9 05:05:18 rockingham slapd[6662]: Unlock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, SIZE=0
Apr 9 05:05:18 rockingham slapd[6662]: STORED QUERIES = 26
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, CACHE =17 entries
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: TEMPLATE 0x94e70f0 QUERIES-- 4
Apr 9 05:05:18 rockingham slapd[6662]: Unlock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, SIZE=0
Apr 9 05:05:18 rockingham slapd[6662]: STORED QUERIES = 25
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, CACHE =17 entries
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: TEMPLATE 0x94e70f0 QUERIES-- 3
Apr 9 05:05:18 rockingham slapd[6662]: Unlock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, SIZE=0
Apr 9 05:05:18 rockingham slapd[6662]: STORED QUERIES = 24
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, CACHE =17 entries
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: TEMPLATE 0x94e70f0 QUERIES-- 2
Apr 9 05:05:18 rockingham slapd[6662]: Unlock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, SIZE=0
Apr 9 05:05:18 rockingham slapd[6662]: STORED QUERIES = 23
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, CACHE =17 entries
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: TEMPLATE 0x94e70f0 QUERIES-- 1
Apr 9 05:05:18 rockingham slapd[6662]: Unlock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, SIZE=0
Apr 9 05:05:18 rockingham slapd[6662]: STORED QUERIES = 22
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, CACHE =17 entries
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: TEMPLATE 0x94e70f0 QUERIES-- 0
Apr 9 05:05:18 rockingham slapd[6662]: Unlock CR index = 0x94e70f0
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, SIZE=0
Apr 9 05:05:18 rockingham slapd[6662]: STORED QUERIES = 21
Apr 9 05:05:18 rockingham slapd[6662]: STALE QUERY REMOVED, CACHE =17 entries
Apr 9 05:05:18 rockingham slapd[6662]: Lock CR index = 0x94e6eb0
Here's the backtrace:
Program terminated with signal 11, Segmentation fault.
#0  0x081764ed in pcache_query_cmp (v1=0xb1f0e900, v2=0x83e58955) at pcache.c:694
694         return pcache_filter_cmp( q1->first, q2->first );
(gdb) bt
#0  0x081764ed in pcache_query_cmp (v1=0xb1f0e900, v2=0x83e58955) at pcache.c:694
#1  0x08199b6d in tavl_delete (root=0x825d7d0, data=0xb1f0e900, fcmp=0x81764d8<pcache_query_cmp>) at tavl.c:202
#2  0x08177ab4 in remove_from_template (qc=0xb1f0e900, template=0x94e6eb0) at pcache.c:1302
#3  0x0817b57d in consistency_check (ctx=0xb44201d0, arg=0x9537c50) at pcache.c:2597
#4  0x0819fd01 in ldap_int_thread_pool_wrapper (xpool=0x94a8fa0) at tpool.c:663
#5  0x009c546b in start_thread () from /lib/libpthread.so.0
#6  0x0091cdbe in clone () from /lib/libc.so.6
(gdb)
(gdb) p q1->first
$3 = (Filter *) 0xb1f0d4c0
(gdb) p q2->first
Cannot access memory at address 0x83e58959
Let me know what else would be useful from this backtrace.
Nothing obvious is jumping out here. Can you run a few of these slapds with libefence or valgrind?
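
In the meantime, a rough sanity check on the addresses already in the backtrace may be worth a look. This is only a sketch: it assumes a 32-bit little-endian x86 build (the 32-bit addresses suggest that, but the report does not say so), and the variable names are made up for illustration. It compares the two arguments passed to pcache_query_cmp with the address gdb could not read.

```python
import struct

# Values copied from the gdb session above.
v1 = 0xB1F0E900          # first argument to pcache_query_cmp
v2 = 0x83E58955          # second argument to pcache_query_cmp
fault_addr = 0x83E58959  # address gdb could not read for q2->first

# Distance from the v2 pointer to the address that faulted.
first_offset = fault_addr - v2

# Remainder modulo 4 for each pointer (assumption: structures on this
# build are at least 4-byte aligned, so a valid pointer gives 0).
v1_misalignment = v1 % 4
v2_misalignment = v2 % 4

# The 32-bit value of v2 as it would sit in little-endian memory.
v2_bytes = struct.pack("<I", v2)
v2_hex = " ".join(f"{b:02x}" for b in v2_bytes)

print(f"fault address - v2 = {first_offset}")
print(f"v1 % 4 = {v1_misalignment}, v2 % 4 = {v2_misalignment}")
print(f"v2 as little-endian bytes: {v2_hex}")
```

The faulting address is v2 + 4, and v2 is not 4-byte aligned while v1 is, which is consistent with v2 itself being a bogus pointer rather than a valid query whose first field was corrupted. The bytes 55 89 e5 also happen to be the usual x86 encoding of push %ebp; mov %esp,%ebp, so the value may have been read from something other than a live query structure. That is a guess from the numbers alone, not a diagnosis.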