Sorry about the stripped trace. I forgot that the install procedure always strips the binaries...
Okay, with our stress profile it takes ~36 hours to fail. I always start with a clean db rebuild before each run. Each failure produces the same traceback:
(gdb) where
#0  0x00b97410 in __kernel_vsyscall ()
#1  0x00471d80 in raise () from /lib/libc.so.6
#2  0x00473691 in abort () from /lib/libc.so.6
#3  0x0046b1fb in __assert_fail () from /lib/libc.so.6
#4  0x0808d532 in ch_malloc (size=4436335) at ch_malloc.c:57
^^^ This really looks like memory exhaustion while trying to malloc a large chunk (>4MB). Can you check, by printing e->e_name from frame #5, whether the server really was modifying a large entry?
p.
#5  0x08079ad2 in entry_encode (e=0x3a3dac0, bv=0x3a3d9b0) at entry.c:742
#6  0x0815240e in bdb_id2entry_put (be=0x3a3dca0, tid=0xbc6f7378, e=0x3a3dac0, flag=0) at id2entry.c:54
#7  0x08152508 in hdb_id2entry_update (be=0x3a3dca0, tid=0xbc6f7378, e=0x3a3dac0) at id2entry.c:90
#8  0x08106374 in hdb_modify (op=0xdabbc28, rs=0x3a3f0e4) at modify.c:611
#9  0x080ea38e in overlay_op_walk (op=0xdabbc28, rs=0x3a3f0e4, which=op_modify, oi=0x8be2788, on=0x0) at backover.c:669
#10 0x080ea543 in over_op_func (op=0xdabbc28, rs=0x3a3f0e4, which=op_modify) at backover.c:721
#11 0x080ea60b in over_op_modify (op=0xdabbc28, rs=0x3a3f0e4) at backover.c:755
#12 0x08089151 in fe_op_modify (op=0xdabbc28, rs=0x3a3f0e4) at modify.c:301
#13 0x08088b90 in do_modify (op=0xdabbc28, rs=0x3a3f0e4) at modify.c:175
#14 0x0806be8f in connection_operation (ctx=0x3a3f1d0, arg_v=0xdabbc28) at connection.c:1115
#15 0x0806c3cf in connection_read_thread (ctx=0x3a3f1d0, argv=0x1a) at connection.c:1251
#16 0x081d8fa9 in ldap_int_thread_pool_wrapper (xpool=0x8b941b0) at tpool.c:685
#17 0x0043749b in start_thread () from /lib/libpthread.so.0
#18 0x0051a42e in clone () from /lib/libc.so.6
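
For anyone reading frames #1 to #4 and wondering why a failed allocation ends in abort(): a rough sketch of the usual "checked malloc" pattern is below. This is not the actual ch_malloc.c source. The names alloc_fn, failing_alloc, try_alloc and checked_malloc are all made up for illustration, and the real function may log and exit differently. It only shows how a NULL return from the allocator can turn into __assert_fail -> abort -> raise on glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocator signature, so the failure path can be
 * exercised without actually exhausting memory. */
typedef void *(*alloc_fn)(size_t);

/* Hypothetical stand-in for an exhausted heap: always returns NULL. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Try to allocate `size` bytes with `alloc`.
 * Returns 1 and stores the pointer in *out on success;
 * returns 0, stores NULL in *out and logs the size on failure. */
int try_alloc(alloc_fn alloc, size_t size, void **out)
{
    void *p = alloc(size);

    if (p == NULL) {
        fprintf(stderr, "allocation of %zu bytes failed\n", size);
        *out = NULL;
        return 0;
    }
    *out = p;
    return 1;
}

/* Allocate with malloc() and refuse to return NULL.
 * On failure the assert fires; with glibc that goes through
 * __assert_fail() and abort(), which raises SIGABRT. */
void *checked_malloc(size_t size)
{
    void *p = NULL;
    int ok = try_alloc(malloc, size, &p);

    assert(ok && "allocation failed");
    if (!ok) {
        abort();    /* still stop if asserts are compiled out (NDEBUG) */
    }
    return p;
}
```

With something like this, a caller never sees NULL: either the request succeeds or the process dies at the allocation site, which is why the trace stops at the allocator rather than at a later NULL dereference. The logged size is the useful part for diagnosis, since it tells you whether the request itself was unusually large or whether the process was simply out of memory.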