Hi Michael,
On Mon, 12 Aug 2013, Michael Ströder wrote:
Christian Kratzer wrote:
On Mon, 12 Aug 2013, Ulrich Windl wrote:
<snipp/>
>> I have always suspected that this is due to the specific setting of:
>>
>>     syncprov-checkpoint <ops> <minutes>
>>         After a write operation has succeeded, write the contextCSN
>>         to the underlying database if <ops> write
>>         operations or more than <minutes> time have passed since the
>>         last checkpoint. Checkpointing is disabled
>>         by default.
>>
>> Not sure though.
>
> do you "query" by slapcat or by an LDAP search? For the former it's documented
> that contextCSN is updated lazily. For the latter I'm not sure.
I have had the same use case Michael is getting at in the back of my head for some time. I would also like to verify replication status by checking the contextCSN on all servers via an ldapsearch from a monitoring script.
I would expect an ldapsearch of the contextCSN to deliver a current and valid value that should be identical over all servers.
It seems this is not the case. I will run a couple of tests to verify this myself in my testbed of 2 MMR masters and 2 slaves.
Even worse, my results with 3 MMR providers and 2 read-only consumer replicas are sometimes not very pleasant regarding data consistency. At the moment these test servers are running as VMs on an ESX server. I will repeat my tests on real hardware to make sure nothing is wrong because of the virtual environment.
I just verified the status on 5 largely identical setups:
one with 3 MMR masters and 2 read-only consumers,
the others with 2 MMR masters and 2 read-only consumers.
The contextCSN of the data and cn=config databases were all in sync from what I could initially see.
I tried disturbing the peace by running some updates and restarting some slaves but could not immediately produce a difference.
This is all openldap-2.4.35 on CentOS or RedHat EL running on a mix of hardware, VMware and KVM virtualisation.
These are all copies of the same setup with different data so the configuration is the same.
Do you have a checkup script I could deploy against my setups to closely monitor contextCSN values? Perhaps my manual checks were just too slow.
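For reference, the kind of check I have in mind could be sketched roughly as below. This is only a minimal sketch under a few assumptions: the ldapsearch client is in the PATH, the contextCSN attribute of the suffix entry is readable with an anonymous simple bind, and the function names, server URIs and suffix are hypothetical placeholders. It treats servers as in sync when they all report the same set of contextCSN values (one value per serverID in an MMR setup). A short-lived difference while writes are in flight would be normal, so a real monitor would probably want to re-check before alerting.

```python
import subprocess
from typing import Dict, FrozenSet, List


def parse_context_csns(ldif: str) -> FrozenSet[str]:
    """Collect every contextCSN value found in ldapsearch LDIF output."""
    values = set()
    for line in ldif.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "contextcsn":
            values.add(value.strip())
    return frozenset(values)


def fetch_context_csns(uri: str, suffix: str) -> FrozenSet[str]:
    """Read contextCSN from the suffix entry of one server.

    Assumes anonymous read access; contextCSN is an operational
    attribute, so it has to be requested by name.
    """
    result = subprocess.run(
        ["ldapsearch", "-x", "-LLL", "-H", uri,
         "-s", "base", "-b", suffix, "contextCSN"],
        check=True, capture_output=True, text=True, timeout=10,
    )
    return parse_context_csns(result.stdout)


def in_sync(csns_by_server: Dict[str, FrozenSet[str]]) -> bool:
    """True if all servers report the same set of contextCSN values."""
    return len(set(csns_by_server.values())) <= 1


def check(uris: List[str], suffix: str) -> bool:
    """Query all servers, print their values, and report the overall state."""
    csns = {uri: fetch_context_csns(uri, suffix) for uri in uris}
    for uri, values in sorted(csns.items()):
        print(uri, ", ".join(sorted(values)) or "(no contextCSN returned)")
    return in_sync(csns)
```

It would be called along the lines of check(["ldap://ldap1.example.com", "ldap://ldap2.example.com"], "dc=example,dc=com"), where the host names and suffix are placeholders for the real setup.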
Greetings
Christian