Full_Name: Rein Tollevik
Version: 2.4.16
OS: linux
URL:
Submission from: (NULL) (81.93.160.250)
Submitted by: rein
I have had a couple of segmentation faults in back-bdb cache.c during a full resync of a consumer. The default bdb *cachesize configuration was in use, which is far too low now that cache limits are enforced more strictly. Some information from two of the core files is listed below; the core files are available for further analysis.
Rein Tollevik Basefarm AS
Core 1:
Program terminated with signal 11, Segmentation fault.
#0 0x0000002a9677ac3c in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
(gdb) where
#0 0x0000002a9677ac3c in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
#1 0x0000002a9567eaa3 in ldap_pvt_thread_mutex_lock (mutex=0x38) at thr_posix.c:296
#2 0x00000000004be2ae in bdb_cache_delete_internal (cache=0x8e1970, e=0x2ac86a0b60, decr=0) at cache.c:1372
#3 0x00000000004beafb in bdb_cache_lru_purge (bdb=0x8e1910) at cache.c:776
#4 0x00000000004bf686 in bdb_cache_find_id (op=0x410006b0, tid=0xe82bd0, id=1, eip=<value optimized out>, flag=0, lock=0x40e7f4d0) at cache.c:1053
#5 0x00000000004c3563 in bdb_dn2entry (op=0x410006b0, tid=0xe82bd0, dn=<value optimized out>, e=0x40e7f500, matched=1, lock=0x40e7f4d0) at dn2entry.c:67
#6 0x00000000004ab5ac in bdb_search (op=0x410006b0, rs=0x40ffffb0) at search.c:374
#7 0x0000000000496b9a in overlay_op_walk (op=0x410006b0, rs=0x40ffffb0, which=<value optimized out>, oi=0x8e20a0, on=0x0) at backover.c:669
#8 0x00000000004975a6 in over_op_func (op=0x410006b0, rs=0x1, which=op_bind) at backover.c:721
#9 0x000000000049769e in over_op_search (op=0x38, rs=0x2ac86a0b60) at backover.c:743
#10 0x0000000000494728 in glue_sub_search (op=0x410006b0, rs=0x40ffffb0, b0=0x0, on=0x8e0a40) at backglue.c:342
#11 0x00000000004953b5 in glue_op_search (op=0x410006b0, rs=0x40ffffb0) at backglue.c:465
#12 0x0000000000496b47 in overlay_op_walk (op=0x410006b0, rs=0x40ffffb0, which=<value optimized out>, oi=0x8e5860, on=0x8d88e0) at backover.c:659
#13 0x00000000004975a6 in over_op_func (op=0x410006b0, rs=0x1, which=op_bind) at backover.c:721
#14 0x000000000049769e in over_op_search (op=0x38, rs=0x2ac86a0b60) at backover.c:743
#15 0x000000000048f3ba in syncrepl_entry (si=0x8e5520, op=0x410006b0, entry=0x9f74d8, modlist=0x41000458, syncstate=1, syncUUID=0x41000500, syncCSN=0x0) at syncrepl.c:2132
#16 0x00000000004925ff in do_syncrep2 (op=0x410006b0, si=0x8e5520) at syncrepl.c:892
#17 0x0000000000494166 in do_syncrepl (ctx=<value optimized out>, arg=0x8e52f0) at syncrepl.c:1361
#18 0x00000000004316d4 in connection_read_thread (ctx=0x41000e10, argv=<value optimized out>) at connection.c:1225
#19 0x0000002a9567e297 in ldap_int_thread_pool_wrapper (xpool=<value optimized out>) at tpool.c:663
#20 0x0000002a96779137 in start_thread () from /lib64/tls/libpthread.so.0
#21 0x0000002a9694f543 in clone () from /lib64/tls/libc.so.6
#22 0x0000000000000000 in ?? ()
(gdb) frame 2
#2 0x00000000004be2ae in bdb_cache_delete_internal (cache=0x8e1970, e=0x2ac86a0b60, decr=0) at cache.c:1372
1372 bdb_cache_entryinfo_lock( e->bei_parent );
(gdb) print *e
$1 = {bei_parent = 0x0, bei_id = 14820, bei_lockpad = 0, bei_state = 128, bei_finders = 0, bei_nrdn = {bv_len = 7, bv_val = 0x0}, bei_e = 0x0, bei_kids = 0x0, bei_kids_mutex = {__m_reserved = 2, __m_count = 0, __m_owner = 0x10000186c, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}, bei_lrunext = 0xabf230, bei_lruprev = 0x0}
(gdb) frame 3
#3 0x00000000004beafb in bdb_cache_lru_purge (bdb=0x8e1910) at cache.c:776
776 bdb_cache_delete_internal( &bdb->bi_cache, elru, 0 );
(gdb) print eicount
$2 = 0
(gdb) print eifree
$3 = 10
Core 2:
Program terminated with signal 11, Segmentation fault.
#0 0x0000002a9677ac3c in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
(gdb) where
#0 0x0000002a9677ac3c in pthread_mutex_lock () from /lib64/tls/libpthread.so.0
#1 0x0000002a9567eaa3 in ldap_pvt_thread_mutex_lock (mutex=0x38) at thr_posix.c:296
#2 0x00000000004be2ae in bdb_cache_delete_internal (cache=0x8e1970, e=0x2acdee8af0, decr=0) at cache.c:1372
#3 0x00000000004beafb in bdb_cache_lru_purge (bdb=0x8e1910) at cache.c:776
#4 0x00000000004bf686 in bdb_cache_find_id (op=0x418003a0, tid=0xacf790, id=10150, eip=<value optimized out>, flag=0, lock=0x4167fb90) at cache.c:1053
#5 0x00000000004acde5 in bdb_search (op=0x418003a0, rs=0x41800330) at search.c:706
#6 0x0000000000496b9a in overlay_op_walk (op=0x418003a0, rs=0x41800330, which=<value optimized out>, oi=0x8e20a0, on=0x0) at backover.c:669
#7 0x00000000004975a6 in over_op_func (op=0x418003a0, rs=0x1, which=op_bind) at backover.c:721
#8 0x000000000049769e in over_op_search (op=0x38, rs=0x2acdee8af0) at backover.c:743
#9 0x0000000000494728 in glue_sub_search (op=0x418003a0, rs=0x41800330, b0=0x0, on=0x8e0a40) at backglue.c:342
#10 0x00000000004953b5 in glue_op_search (op=0x418003a0, rs=0x41800330) at backglue.c:465
#11 0x0000000000496b47 in overlay_op_walk (op=0x418003a0, rs=0x41800330, which=<value optimized out>, oi=0x8e5860, on=0x8d88e0) at backover.c:659
#12 0x00000000004975a6 in over_op_func (op=0x418003a0, rs=0x1, which=op_bind) at backover.c:721
#13 0x000000000049769e in over_op_search (op=0x38, rs=0x2acdee8af0) at backover.c:743
#14 0x0000002a9a71c5b9 in syncprov_findcsn (op=0xdc83f0, mode=FIND_PRESENT) at syncprov.c:707
#15 0x0000002a9a71d806 in syncprov_op_search (op=0xdc83f0, rs=0x41801cc0) at syncprov.c:2484
#16 0x0000000000496b47 in overlay_op_walk (op=0xdc83f0, rs=0x41801cc0, which=<value optimized out>, oi=0x8e5860, on=0x8d8dd0) at backover.c:659
#17 0x00000000004975a6 in over_op_func (op=0xdc83f0, rs=0x1, which=op_bind) at backover.c:721
#18 0x000000000049769e in over_op_search (op=0x38, rs=0x2acdee8af0) at backover.c:743
#19 0x000000000043292a in fe_op_search (op=0xdc83f0, rs=0x41801cc0) at search.c:366
#20 0x000000000043336c in do_search (op=0xdc83f0, rs=0x41801cc0) at search.c:217
#21 0x0000000000430775 in connection_operation (ctx=0x41801e10, arg_v=<value optimized out>) at connection.c:1097
#22 0x00000000004316c8 in connection_read_thread (ctx=0x41801e10, argv=<value optimized out>) at connection.c:1223
#23 0x0000002a9567e297 in ldap_int_thread_pool_wrapper (xpool=<value optimized out>) at tpool.c:663
#24 0x0000002a96779137 in start_thread () from /lib64/tls/libpthread.so.0
#25 0x0000002a9694f543 in clone () from /lib64/tls/libc.so.6
#26 0x0000000000000000 in ?? ()
(gdb) frame 2
#2 0x00000000004be2ae in bdb_cache_delete_internal (cache=0x8e1970, e=0x2acdee8af0, decr=0) at cache.c:1372
1372 bdb_cache_entryinfo_lock( e->bei_parent );
(gdb) print *e
$1 = {bei_parent = 0xa775e0, bei_id = 10151, bei_lockpad = 0, bei_state = 128, bei_finders = 0, bei_nrdn = {bv_len = 11, bv_val = 0x2acec4b630 "dc=ud-www01"}, bei_e = 0x0, bei_kids = 0x0, bei_kids_mutex = {__m_reserved = 1, __m_count = 0, __m_owner = 0x100001789, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}, bei_lrunext = 0xd671c0, bei_lruprev = 0xd671c0}
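
Note on mutex=0x38: in both traces ldap_pvt_thread_mutex_lock is called with mutex=0x38, and in Core 1 the entry being purged has bei_parent = 0x0. A small address like that is what you would expect if the address of a mutex member is taken through a NULL struct pointer. The sketch below only illustrates that arithmetic. The struct is a hypothetical stand-in whose field types are guesses based on the gdb print above, not the real back-bdb EntryInfo definition, and lock_parent_checked() is an illustrative helper, not a proposed patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the printed structure; field names mirror the
 * gdb output, field types are assumptions made for illustration only. */
struct entry_info {
    struct entry_info *parent;      /* bei_parent */
    unsigned long      id;          /* bei_id */
    short              lockpad;     /* bei_lockpad */
    short              state;       /* bei_state */
    int                finders;     /* bei_finders */
    size_t             nrdn_len;    /* bei_nrdn.bv_len */
    char              *nrdn_val;    /* bei_nrdn.bv_val */
    void              *entry;       /* bei_e */
    void              *kids;        /* bei_kids */
    pthread_mutex_t    kids_mutex;  /* bei_kids_mutex */
};

/* Address that &parent->kids_mutex would evaluate to, computed with
 * offsetof so that nothing is actually dereferenced. */
static uintptr_t kids_mutex_addr(const struct entry_info *parent)
{
    return (uintptr_t)parent + offsetof(struct entry_info, kids_mutex);
}

/* With a NULL parent the "mutex address" is just the member offset. */
static void check_null_parent_offset(void)
{
    assert(kids_mutex_addr(NULL) == offsetof(struct entry_info, kids_mutex));
}

/* Illustrative guard: lock the parent's mutex only when a parent exists.
 * Returns 0 on success, -1 when there is no parent or the lock fails. */
static int lock_parent_checked(struct entry_info *e)
{
    if (e == NULL || e->parent == NULL)
        return -1;
    return pthread_mutex_lock(&e->parent->kids_mutex) == 0 ? 0 : -1;
}
```

On an LP64 system with this assumed layout, the members before kids_mutex occupy 56 bytes, so the offset is 0x38, which is consistent with the value seen in frame #1. The sketch does not explain Core 2, where the same mutex=0x38 appears although the printed bei_parent is non-NULL at the time of the dump.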