--On Friday, March 04, 2011 9:08 PM +0000 hyc@symas.com wrote:
dhawes@vt.edu wrote:
On 03/03/2011 02:38 PM, Quanah Gibson-Mount wrote:
--On Thursday, March 03, 2011 7:34 PM +0000 dhawes@vt.edu wrote:
Full_Name: David Hawes
Version: 2.4.24
OS: Ubuntu 10.04
URL:
Submission from: (NULL) (128.173.39.26)
When using slapadd or slapindex with the -q option, the message "Closing DB..." is printed and then the application hangs indefinitely. Removing the -q option allows the application to complete without issue.
This occurs with Berkeley DB 4.7.25 (with patches) and 5.1.25.
I would ask that you provide a full backtrace of the slapadd process after it has hung. Otherwise, this report isn't of much use.
Also, if you are using the Ubuntu patches for OpenLDAP with your OpenLDAP build, you are including a known database-corrupting patch. Since you don't say how you built OpenLDAP, it is impossible for us to know if you did this or not.
Both OpenLDAP and Berkeley DB are compiled from source. No Ubuntu packages or code are used.
Backtraces (I may need to recompile without optimization):
(gdb) thread apply all bt
Thread 2 (Thread 0x7ffee9003700 (LWP 29225)):
#0  0x00007ffff763c85c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00000000004b4150 in bdb_tool_trickle_task (ctx=<value optimized out>, ptr=<value optimized out>) at tools.c:1253
#2  0x00000000005066b0 in ldap_int_thread_pool_wrapper (xpool=<value optimized out>) at tpool.c:685
#3  0x00007ffff76379ca in start_thread () from /lib/libpthread.so.0
#4  0x00007ffff677970d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()
This indicates that the trickle task is still waiting for a signal on its condition variable, which is a bit odd since bdb_tool_entry_close() already signals it before slap_tool_destroy() is called.
It might be illuminating to run slapadd under gdb with a breakpoint on bdb_tool_entry_close(), single-step through the first few lines of that function where it issues the signal, and see whether the trickle task actually reacts.
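For anyone following along who wants to see the signalling pattern in isolation, here is a minimal sketch in C. It is not the OpenLDAP code: flush_state, flush_task, flush_request, flush_close and run_flush_demo are hypothetical names made up for this illustration, and nothing here is taken from tools.c or tpool.c. It shows a background task that waits on a condition variable while checking flags under the mutex, so a shutdown signal sent before the task starts waiting is still seen. A waiter that relied on the signal alone could sleep forever if the signal arrived early, and the closing thread would then block waiting for it. Whether that is what happens in slapadd is not established in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a background task; these names are
 * illustrative and are NOT taken from OpenLDAP. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            work_ready;  /* set when there is work for the task */
    bool            shutdown;    /* set when the task should exit */
    int             flushes;     /* number of work passes performed */
} flush_state;

static void *flush_task(void *arg)
{
    flush_state *st = arg;

    pthread_mutex_lock(&st->mutex);
    for (;;) {
        /* Wait on the flags, not on the signal itself. If the closer
         * set 'shutdown' before we got here, the loop is skipped and
         * the early signal is not lost. */
        while (!st->work_ready && !st->shutdown)
            pthread_cond_wait(&st->cond, &st->mutex);
        assert(st->work_ready || st->shutdown);

        if (st->shutdown)
            break;
        st->work_ready = false;
        st->flushes++;           /* a real task would do its work here */
    }
    pthread_mutex_unlock(&st->mutex);
    return NULL;
}

/* Ask the task to do one pass of work. */
static void flush_request(flush_state *st)
{
    pthread_mutex_lock(&st->mutex);
    st->work_ready = true;
    pthread_cond_signal(&st->cond);
    pthread_mutex_unlock(&st->mutex);
}

/* Tell the task to exit, then wait for it. Returns pthread_join's result. */
static int flush_close(flush_state *st, pthread_t tid)
{
    pthread_mutex_lock(&st->mutex);
    st->shutdown = true;         /* state change first, under the mutex */
    pthread_cond_signal(&st->cond);
    pthread_mutex_unlock(&st->mutex);
    return pthread_join(tid, NULL);
}

/* Starts the task, requests work, shuts down. Returns 0 on a clean exit. */
int run_flush_demo(void)
{
    flush_state st = { .work_ready = false, .shutdown = false, .flushes = 0 };
    pthread_t tid;
    int rc;

    if (pthread_mutex_init(&st.mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&st.cond, NULL) != 0) {
        pthread_mutex_destroy(&st.mutex);
        return -1;
    }

    rc = pthread_create(&tid, NULL, flush_task, &st);
    if (rc == 0) {
        flush_request(&st);
        rc = flush_close(&st, tid);
    }

    pthread_cond_destroy(&st.cond);
    pthread_mutex_destroy(&st.mutex);
    return rc;
}
```

Calling run_flush_demo() from a main() and building with -pthread should return 0 once the task has exited. The number of work passes is deliberately not checked, since the shutdown flag may be set before the task gets to the pending request.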
Thread 1 (Thread 0x7ffff7fd9700 (LWP 29220)):
#0  0x00007ffff763c85c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0000000000506223 in ldap_pvt_thread_pool_destroy (tpool=0x7ffffffed658, run_pending=<value optimized out>) at tpool.c:582
#2  0x0000000000506a0a in ldap_int_thread_pool_shutdown () at tpool.c:181
#3  0x00000000005050a9 in ldap_pvt_thread_destroy () at threads.c:70
#4  0x0000000000466059 in slap_destroy () at init.c:273
#5  0x00000000004a5ade in slap_tool_destroy () at slapcommon.c:932
#6  0x00000000004a46e7 in slapadd (argc=0, argv=<value optimized out>) at slapadd.c:606
#7  0x000000000041edc0 in main (argc=4, argv=0x7fffffffe048) at main.c:407
We are seeing numerous reports of this occurring with Zimbra after using OpenLDAP 2.4.23 + the multi-core fix (ITS#6660).
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration