Hi,
We use a single master and two read-only replicas, with back-bdb on all systems. Each replica replicates from the master via syncrepl, configured for refreshAndPersist. During a particularly heavy update load recently, replication on one of the replicas started to fail because of a misconfigured DB_CONFIG.
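For reference, the consumer stanza on each replica looks roughly like the sketch below; the provider URL, bind DN, credentials, and retry interval are placeholders rather than our real values, while rid=001 and the searchbase match the log excerpt that follows:

# syncrepl consumer on a read-only replica (placeholder values)
syncrepl rid=001
         provider=ldap://master.example.edu
         type=refreshAndPersist
         searchbase="dc=csupomona,dc=edu"
         scope=sub
         bindmethod=simple
         binddn="cn=replicator,dc=csupomona,dc=edu"
         credentials=secret
         retry="60 +"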
The replica repeatedly wrote the following messages to its log as it tried, and failed, to restart replication:

Dec 14 04:01:08 pip-dev slapd[12645]: bdb(dc=csupomona,dc=edu): Lock table is out of available lock entries
Dec 14 04:01:08 pip-dev slapd[12645]: => bdb_idl_delete_key: c_get failed: Cannot allocate memory (12)
Dec 14 04:01:08 pip-dev slapd[12645]: conn=-1 op=0: attribute "memberUid" index delete failure
Dec 14 04:01:08 pip-dev slapd[12645]: null_callback : error code 0x50
Dec 14 04:01:08 pip-dev slapd[12645]: syncrepl_entry: rid=001 be_modify failed (80)
Dec 14 04:01:08 pip-dev slapd[12645]: do_syncrepl: rid=001 rc 80 retrying
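The "Lock table is out of available lock entries" message is Berkeley DB reporting that the lock region sized by DB_CONFIG was exhausted; large multi-valued index updates (memberUid here) can run through an undersized lock table very quickly. The limits are controlled by the lock-related DB_CONFIG directives, roughly along these lines (the numbers are illustrative only, not a recommendation):

set_lk_max_locks   30000
set_lk_max_objects 30000
set_lk_max_lockers 1500

Running db_stat -c against the database environment reports current and peak lock usage, which is a straightforward way to check whether the configured limits are adequate.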
Shortly afterward, the master slapd crashed without writing anything to its log indicating why (or even referencing the crash at all). We first noticed this behavior with a 2.4.26 master and a 2.4.28 read-only replica (the version mismatch is because we hit the problem in the middle of some maintenance). I have since reproduced it with a 2.4.28 master while researching ITS #7113 [1], which describes the problem more precisely and in more detail.

Has anyone else run into this issue? Is there a good way to insulate the master slapd from misconfigured replicas? Our replicas should no longer break like this (we have retuned our DB_CONFIG to prevent it), and hopefully slapd can be changed so that the master survives even when a replica does break, but in the meantime we would rather not have to worry about the master crashing if our DB_CONFIG again proves inadequate.
[1] http://www.openldap.org/its/index.cgi/Incoming?id=7113
Thanks for any help,