--On Saturday, September 06, 2008 12:08 AM +0000 hyc@symas.com wrote:
ando@sys-net.it wrote:
h.b.furuseth@usit.uio.no wrote:
[quanah@freelancer tests]$ *** glibc detected *** /home/quanah/q/openldap-2.4.12/servers/slapd/.libs/lt-slapd: corrupted double-linked list: 0x000000000d28a430 ***
This message comes from glibc malloc. Its data structures are corrupted.
I couldn't figure out much from the the valgrind output. But you'll get more frequent mallocs with CPPFLAGS=-DSLAP_NO_SL_MALLOC, if that code isn't rotted - maybe that'll bring out the problem in another guise. If that code still works, I don't remember.
Here's my output after compiling with -DSLAP_NO_SL_MALLOC=1, running test039 under valgrind and suddenly killing the process:
==9657== Memcheck, a memory error detector. ==9657== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. ==9657== Using LibVEX rev 1658, a library for dynamic binary translation. ==9657== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==9657== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==9657== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==9657== For more details, rerun with: -v ==9657== @(#) $OpenLDAP: slapd 2.X (Sep 6 2008 01:35:19) $ masarati@odino.intra.sys-net.it:/home/masarati/Lavoro/sysnet/Ldap/ldap- devel/servers/slapd slapd starting daemon: shutdown requested and initiated. slapd shutdown: waiting for 8 threads to terminate ==9657== Thread 3: ==9657== Invalid read of size 4 ==9657== at 0x807EF03: slap_listener (daemon.c:1754) ==9657== by 0x807F9AC: slap_listener_thread (daemon.c:1997) ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== Address 0x415D3A8 is 32 bytes inside a block of size 164 free'd ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139) ==9657== by 0x807EDF8: close_listeners (daemon.c:1703) ==9657== by 0x808137F: slapd_daemon_task (daemon.c:2578) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== ==9657== Invalid write of size 4 ==9657== at 0x807EF22: slap_listener (daemon.c:1759) ==9657== by 0x807F9AC: slap_listener_thread (daemon.c:1997) ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== Address 0x415D3A4 is 28 bytes inside a block of size 164 free'd ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139) ==9657== by 0x807EDF8: close_listeners (daemon.c:1703) ==9657== by 0x808137F: slapd_daemon_task (daemon.c:2578) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== ==9657== Invalid read of size 4 ==9657== at 0x807EFCD: slap_listener (daemon.c:1781) ==9657== by 0x807F9AC: slap_listener_thread (daemon.c:1997) ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== Address 0x415D3A8 is 32 bytes inside a block of size 164 free'd ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139) ==9657== by 0x807EDF8: close_listeners (daemon.c:1703) ==9657== by 0x808137F: slapd_daemon_task (daemon.c:2578) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) daemon: accept(7) failed errno=9 (Bad file descriptor) ==9657== ==9657== Thread 5: ==9657== Invalid free() / delete / delete[] ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80E522B: slap_sl_free (sl_malloc.c:451) ==9657== by 0x80874AF: do_search (search.c:221) ==9657== by 0x8084347: connection_operation (connection.c:1084) ==9657== by 0x8084834: connection_read_thread (connection.c:1211) ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== Address 0x5595D60 is 0 bytes inside a block of size 29 free'd ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139) ==9657== by 0x81E1D64: rwm_op_cleanup (rwm.c:64) ==9657== by 0x80976CD: slap_cleanup_play (result.c:341) ==9657== by 0x8097DEA: send_ldap_response (result.c:522) ==9657== by 0x809850E: slap_send_ldap_result (result.c:642) ==9657== by 0x81855CE: ldap_back_op_result (bind.c:1823) ==9657== by 0x8184C3D: ldap_back_dobind_int (bind.c:1486) ==9657== by 0x8184D41: ldap_back_dobind (bind.c:1520) ==9657== by 0x812C9F7: ldap_back_search (search.c:174) ==9657== by 0x80F7DF4: glue_sub_search (backglue.c:340) ==9657== by 0x80F8458: glue_op_search (backglue.c:467) ==9657== by 0x80FACE3: overlay_op_walk (backover.c:657) ==9657== by 0x80FAF18: over_op_func (backover.c:719) ==9657== by 0x80FAF9C: over_op_search (backover.c:741) ==9657== by 0x8087ADB: fe_op_search (search.c:366) ==9657== by 0x80FAD63: overlay_op_walk (backover.c:667) ==9657== by 0x80FAF18: over_op_func (backover.c:719) ==9657== by 0x80FAF9C: over_op_search (backover.c:741) ==9657== ==9657== Invalid free() / delete / delete[] ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80E522B: slap_sl_free (sl_malloc.c:451) ==9657== by 0x80874D3: do_search (search.c:224) ==9657== by 0x8084347: connection_operation (connection.c:1084) ==9657== by 0x8084834: connection_read_thread (connection.c:1211) ==9657== by 0x820F3D4: ldap_int_thread_pool_wrapper (tpool.c:663) ==9657== by 0xDEB46A: start_thread (in /lib/libpthread-2.5.so) ==9657== by 0xD42DBD: clone (in /lib/libc-2.5.so) ==9657== Address 0x5595E60 is 0 bytes inside a block of size 29 free'd ==9657== at 0x4004FDA: free (vg_replace_malloc.c:233) ==9657== by 0x82407C9: ber_memfree_x (memory.c:152) ==9657== by 0x80A59E5: ch_free (ch_malloc.c:139) ==9657== by 0x81E1D90: rwm_op_cleanup (rwm.c:69) ==9657== by 0x80976CD: slap_cleanup_play (result.c:341) ==9657== by 0x8097DEA: send_ldap_response (result.c:522) ==9657== by 0x809850E: slap_send_ldap_result (result.c:642) ==9657== by 0x81855CE: ldap_back_op_result (bind.c:1823) ==9657== by 0x8184C3D: ldap_back_dobind_int (bind.c:1486) ==9657== by 0x8184D41: ldap_back_dobind (bind.c:1520) ==9657== by 0x812C9F7: ldap_back_search (search.c:174) ==9657== by 0x80F7DF4: glue_sub_search (backglue.c:340) ==9657== by 0x80F8458: glue_op_search (backglue.c:467) ==9657== by 0x80FACE3: overlay_op_walk (backover.c:657) ==9657== by 0x80FAF18: over_op_func (backover.c:719) ==9657== by 0x80FAF9C: over_op_search (backover.c:741) ==9657== by 0x8087ADB: fe_op_search (search.c:366) ==9657== by 0x80FAD63: overlay_op_walk (backover.c:667) ==9657== by 0x80FAF18: over_op_func (backover.c:719) ==9657== by 0x80FAF9C: over_op_search (backover.c:741) conn=3 op=1 ldap_back_retry: retrying URI="ldap://localhost:9011" DN="cn=manager,dc=example,dc=com" ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() ==9657== Warning: invalid file descriptor -1 in syscall close() slapd stopped. ==9657== ==9657== ERROR SUMMARY: 7 errors from 5 contexts (suppressed: 36 from 1) ==9657== malloc/free: in use at exit: 1,127 bytes in 15 blocks. ==9657== malloc/free: 34,414 allocs, 34,403 frees, 12,376,432 bytes allocated. ==9657== For counts of detected errors, rerun with: -v ==9657== searching for pointers to 15 not-freed blocks. ==9657== checked 774,072 bytes.
Apparently, the listener is freed while some thread is still using it.
Since that can only occur during shutdown, I don't think it's really a high priority problem. More to the point, it will not cause any heap corruption. The serious problem is in thread 5, where rwm_op_cleanup frees a DN that do_search also tries to free. It looks like rwm_op_cleanup is badly broken here, it apparently should be checking to make sure ros->ro_dn is != ros->r_dn before freeing r_dn (and ndn).
With the latest rwm revision, I still get the glibc corruption:
PID=15912 - Search(500): base="o=Example,c=US" scope=sub filter="(cn=Alumni Assoc Staff)" attrs=cn (more...).
[quanah@tribes tests]$ *** glibc detected *** corrupted double-linked list: 0x0000003bf99316b8 ***
when I hit control-C during the search.
--Quanah
--
Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc -------------------- Zimbra :: the leader in open source messaging and collaboration