I've now seen this on two different masters, both using BDB 4.2.52 + patches,
which we have long considered stable:
Thread 3 (Thread 1115703648 (LWP 31507)):
#0 0x0000003fca90bd9c in __pread_nocancel () from
/lib64/tls/libpthread.so.0
#1 0x0000002a97348edc in __os_io (dbenv=0xdc3100, op=1, fhp=0xf7d2d0,
pgno=71531, pagesize=4096, buf=0x2cb08f8cf0 "_3", niop=0x42681b70) at
../dist/../os/os_rw.c:55
#2 0x0000002a973405ea in __memp_pgread (dbmfp=0xeb5780,
mutexp=0x2cbba53150, bhp=0x2cb08f8c58, can_create=0) at
../dist/../mp/mp_bh.c:219
#3 0x0000002a973413f7 in __memp_fget (dbmfp=0xeb5780, pgnoaddr=0x42681cc4,
flags=0, addrp=0x42681cc8) at ../dist/../mp/mp_fget.c:580
#4 0x0000002a972d18b3 in __bam_search (dbc=0xdb1dc40, root_pgno=Variable
"root_pgno" is not available.
) at ../dist/../btree/bt_search.c:307
#5 0x0000002a972c6e27 in __bam_c_search (dbc=0xdb1dc40, root_pgno=0,
key=0x42681f40, flags=28, exactp=0x42681df4) at
../dist/../btree/bt_cursor.c:2547
#6 0x0000002a972c7b7e in __bam_c_get (dbc=0xdb1dc40, key=0x42681f40,
data=0x42681f20, flags=28, pgnop=0x42681e64) at
../dist/../btree/bt_cursor.c:962
#7 0x0000002a9730f51b in __db_c_get (dbc_arg=0x28ea700, key=0x42681f40,
data=0x42681f20, flags=28) at ../dist/../db/db_cam.c:643
#8 0x0000002a973177d7 in __db_c_get_pp (dbc=0x28ea700, key=0x42681f40,
data=0x42681f20, flags=28) at ../dist/../db/db_iface.c:1836
#9 0x0000002a97183dc2 in bdb_dn2id (op=0x428025e0, dn=0x42682028,
ei=0x42682010, locker=44, lock=0x42681fd0) at dn2id.c:315
#10 0x0000002a971896fd in bdb_cache_find_ndn (op=0x428025e0, locker=44,
ndn=0x291c370, res=0x42682310) at cache.c:341
#11 0x0000002a97189e39 in bdb_cache_find_id (op=0x428025e0, tid=0x0,
id=25145, eip=0x42682310, islocked=0, locker=44, lock=0x42682250) at
cache.c:716
#12 0x0000002a9717b1c7 in bdb_search (op=0x428025e0, rs=0x42802570) at
search.c:696
#13 0x0000002a975921eb in syncprov_findcsn (op=0x5ccd900,
mode=FIND_PRESENT) at syncprov.c:681
#14 0x0000002a9759690a in syncprov_op_search (op=0x5ccd900, rs=0x42803d60)
at syncprov.c:2074
#15 0x000000000049b935 in overlay_op_walk (op=0x5ccd900, rs=0x42803d60,
which=op_search, oi=0xded8c0, on=0xded700) at backover.c:640
#16 0x000000000049bb91 in over_op_func (op=0x5ccd900, rs=0x42803d60,
which=op_search) at backover.c:702
#17 0x000000000049bc27 in over_op_search (op=0x5ccd900, rs=0x42803d60) at
backover.c:724
#18 0x000000000042f1d8 in fe_op_search (op=0x5ccd900, rs=0x42803d60) at
search.c:355
#19 0x000000000042ecac in do_search (op=0x5ccd900, rs=0x42803d60) at
search.c:217
#20 0x000000000042bd9c in connection_operation (ctx=0x42803e90,
arg_v=0x5ccd900) at connection.c:1133
#21 0x000000000042c28c in connection_read_thread (ctx=0x42803e90,
argv=0x1e) at connection.c:1261
#22 0x0000002a956c7c77 in ldap_int_thread_pool_wrapper (xpool=0x8a1f00) at
tpool.c:478
#23 0x0000003fca90610a in start_thread () from /lib64/tls/libpthread.so.0
#24 0x0000003fca0c68c3 in clone () from /lib64/tls/libc.so.6
#25 0x0000000000000000 in ?? ()
Has anyone else seen anything like this? It looks like there may be a bug
in BDB 4.2.52 that gets hit occasionally.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration