I'm trying to stabilize our OpenLDAP server farm before going live, and I'm finding that even though the contextCSN matches between providers and replicas, the actual content of the servers is getting out of sync. This is most prominent when we test our population routine, which requires removing all accounts before starting. Right now that is only about 22,000 entries (it will get much larger).
During the mass delete we got the following sprinkled throughout the logs on all machines:
====
Nov 15 15:47:16 idm-prod-ldap-2 slapd[33070]: bdb(dc=domain,dc=name): previous transaction deadlock return not resolved
Nov 15 15:47:16 idm-prod-ldap-2 slapd[33070]: => bdb_idl_delete_key: cursor failed: Invalid argument (22)
and the various replicas still had accounts left over, but the leftover entries differed from one replica to the next.
Granted, the above errors might be explained by the fact that we don't yet have enough RAM on the machines. However, it does present us with a problem: when we notice a discrepancy, how do we re-sync the data from the provider server at run time? I have tried slapd -c rid=2,csn=20111114000000.000000Z, but that doesn't seem to do any good. (I've tried several different csn values, such as csn=0 and csn=20111114000000.000000Z#000000#000#000000, to no avail.)
I guess my question is twofold: how do I really verify that replication is working properly and is in sync, and how do I force a replica to just take the current content from a provider without question? (I don't really want to remove the database and have it re-sync; I would rather have it go through and check the content and update as needed.)
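To make the first half of the question concrete, here is a rough sketch of the kind of content check I have in mind, since a matching contextCSN evidently isn't enough. It assumes each server's entries have been dumped to text with something like ldapsearch -LLL -x -H ldap://HOST -b "dc=domain,dc=name" entryCSN, and it compares DNs and entryCSN values between the two dumps. The function names and the sample DNs are made up for illustration, and lowercasing the DN is only an approximation of real DN matching.

```python
def parse_entry_csns(ldif_text):
    """Map each DN to its entryCSN from `ldapsearch -LLL ... entryCSN` output.

    Assumption: DNs are plain text (no base64 "dn::" lines).
    """
    # Unfold LDIF continuation lines (a newline followed by a single space).
    unfolded = ldif_text.replace("\n ", "")
    entries = {}
    dn = None
    for line in unfolded.splitlines():
        if line.startswith("dn: "):
            # Lowercasing is a crude stand-in for proper DN normalization.
            dn = line[len("dn: "):].strip().lower()
            entries[dn] = None
        elif line.startswith("entryCSN: ") and dn is not None:
            entries[dn] = line[len("entryCSN: "):].strip()
        elif not line.strip():
            # A blank line ends the current entry.
            dn = None
    return entries


def compare_entries(provider, replica):
    """Report DNs that are missing, extra, or stale on the replica."""
    provider_dns = set(provider)
    replica_dns = set(replica)
    return {
        # On the provider but not on the replica.
        "missing": sorted(provider_dns - replica_dns),
        # On the replica but not on the provider (e.g. deletes that never arrived).
        "extra": sorted(replica_dns - provider_dns),
        # On both, but with a different entryCSN.
        "stale": sorted(
            dn for dn in provider_dns & replica_dns if provider[dn] != replica[dn]
        ),
    }


if __name__ == "__main__":
    # Hypothetical sample dumps; real input would come from ldapsearch.
    provider_ldif = (
        "dn: uid=a,dc=domain,dc=name\n"
        "entryCSN: 20111115154716.000001Z#000000#001#000000\n"
        "\n"
        "dn: uid=b,dc=domain,dc=name\n"
        "entryCSN: 20111115154716.000002Z#000000#001#000000\n"
    )
    replica_ldif = (
        "dn: uid=b,dc=domain,dc=name\n"
        "entryCSN: 20111114000000.000000Z#000000#001#000000\n"
        "\n"
        "dn: uid=c,dc=domain,dc=name\n"
        "entryCSN: 20111115154716.000003Z#000000#001#000000\n"
    )
    report = compare_entries(
        parse_entry_csns(provider_ldif), parse_entry_csns(replica_ldif)
    )
    for kind, dns in report.items():
        print(kind, dns)
```

This only detects differences; it does not repair them, which is the part I'm still looking for a supported way to do.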
Thanks,
Jeffrey Crawford