Hello,
I have a pair of OpenLDAP servers that had been replicating flawlessly with delta-syncrepl for about 10 months. The other day, I noticed that modifications were no longer being replicated and that these messages were appearing in syslog on the master server immediately after the MOD line:
[ID 651871 local0.debug] => bdb_idl_insert_key: c_get next_dup failed: DB_NOTFOUND: No matching key/data pair found (-30990)
[ID 809268 local0.debug] => bdb_dn2id_add: parent (cn=log) insert failed: -30990
I assume that something has become corrupted in the BDB database for cn=log on the master. Does that seem correct? I'm definitely not seeing any new entries in the cn=log database since those messages began appearing.
If it is a corrupted index, I think that running "slapindex -b cn=log -f .... " after stopping the slapd process will fix it. Once that completes, I should be able to restart slapd and verify that writes to entries under the baseDN cause new entries to appear in the cn=log database.
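As a rough sketch of the reindex step described above: the config path is a placeholder (the post elides it), the `run` helper and `reindex_logdb` function are names made up for illustration, and stopping and restarting slapd is left out because it is site-specific. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: SLAPD_CONF is a placeholder, not a path taken from the post.
SLAPD_CONF="${SLAPD_CONF:-/path/to/slapd.conf}"   # hypothetical location
DRY_RUN="${DRY_RUN:-1}"                            # 1 = print only, 0 = execute

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reindex_logdb() {
    # Precondition (not shown): slapd on the master is already stopped.
    # Rebuilds the indices of the database whose suffix is cn=log.
    run slapindex -b cn=log -f "$SLAPD_CONF"
}

reindex_logdb
```

Setting DRY_RUN=0 would actually run slapindex, which should only be done with slapd stopped and, presumably, as the user that owns the database files.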
If it's not an index, I have no idea how to repair this. I found the error message in the sources (servers/slapd/back-bdb/idl.c:789 in version 2.3.30) but honestly, I have no idea what that code is doing.
Once (if) I can repair things, I can begin worrying about getting changes to the replica again. Since there are changes missing from the cn=log database on the master, I assume that I'll need to cause a complete re-sync. Is there a better way to accomplish that than removing the entire database on the replica, using slapadd to import a recent backup of the master, and restarting the replica?
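The reload procedure described above might look roughly like this on the replica. Both paths are placeholders, `run` and `reload_replica` are illustrative names, and the backup is assumed to be an LDIF export of the master's dc=our,dc=domain database. Stopping slapd and removing the old database files are site-specific and are not shown. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch only: every path here is a placeholder, not a value from the post.
REPLICA_CONF="${REPLICA_CONF:-/path/to/replica/slapd.conf}"   # hypothetical
BACKUP_LDIF="${BACKUP_LDIF:-/path/to/master-backup.ldif}"     # hypothetical LDIF export
DRY_RUN="${DRY_RUN:-1}"                                        # 1 = print only, 0 = execute

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reload_replica() {
    # Preconditions (not shown): slapd on the replica is stopped and the
    # old database files have been removed.
    run slapadd -b "dc=our,dc=domain" -f "$REPLICA_CONF" -l "$BACKUP_LDIF"
}

reload_replica
```

After the import, slapd on the replica would be restarted so that it can resume syncing from the master.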
Some specifics in case they matter:
Master:
  Solaris10 amd64
  BDB 4.2.52 + 5 patches
  OpenLDAP 2.3.30
Replica:
  Solaris10 amd64
  BDB 4.2.52 + 5 patches
  OpenLDAP 2.3.38 (upgraded from 2.3.33 the day before the problem began on the Master)
What I believe to be the relevant portions of the slapd.conf file from the Master (slightly obfuscated) are included at the end of this message.
Thank you for any help,
-Ben
# access log database (used by syncprov-delta replication)
database bdb
suffix "cn=log"
directory /var/openldap/data/prod/logdb
rootdn "cn=Manager,dc=our,dc=domain"
mode 0660
shm_key 142
index default eq
index objectClass,entryUUID,entryCSN eq
index reqStart,reqEnd,reqResult,reqType eq
access to dn.subtree="cn=log"
    by group.exact="cn=DirectoryAdmins,cn=administrators,dc=our,dc=domain" write
    by dn.onelevel="cn=SyncUsers,cn=administrators,dc=our,dc=domain" read
    by * none

overlay syncprov
syncprov-nopresent TRUE
syncprov-reloadhint TRUE

# This is all one line
limits dn.onelevel="cn=SyncUsers,cn=administrators,dc=our,dc=domain" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited

database hdb
suffix "dc=our,dc=domain"
rootdn "cn=manager,dc=our,dc=domain"
rootpw {SHA}[XXX REMOVED XXX]
directory /var/openldap/data/prod/db
checkpoint 100000 30
mode 0660
shm_key 42
cachesize 500000
idlcacheSize 1500000
index default pres,eq
index givenName,description,uid,cn,sn pres,eq,sub
index objectClass,uniqueMember,member eq
index employeeNumber eq,sub
index entryCSN,entryUUID eq

overlay ppolicy
ppolicy_default cn=standard,cn=policies,dc=our,dc=domain

overlay dynlist
dynlist-attrset groupOfURLs memberURL member

overlay syncprov
syncprov-checkpoint 100000 30
syncprov-sessionlog 300000

overlay accesslog
logdb cn=log
logops writes
logsuccess TRUE
logold (objectClass=inetOrgPerson)
logpurge 28+00:00 01+00:00

# This is all one line
limits dn.onelevel="cn=SyncUsers,cn=administrators,dc=our,dc=domain" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited