John Morrissey wrote:
To make a long story short, it seems that syncrepl doesn't update the backend's contextCSN until it's processed its backlog? To check, I stopped another consumer and let a backlog build, then started it at debuglevel 16384 and watched the backend's contextCSN with ldapsearch(1). contextCSN didn't increment until the backlog was completely processed, even though I could see the changes it was processing with ldapsearch(1) as soon as they were processed.
If a consumer processes replication without updating the backend's contextCSN, it will try to re-process the same replication entries when it starts up again, which will generally fail. This seems to leave one in a bind: either determine the correct value for contextCSN and update it by hand, or remove the backend's data files and let syncrepl rebuild them from scratch. If this assessment is correct, this behavior doesn't seem desirable.
Your description of the behavior is correct. It's required to work that way with regular syncrepl; it probably should work differently for delta-sync, but nobody pointed it out before.
In regular syncrepl the refresh phase does a regular LDAP search, whose results are returned in whatever arbitrary order the database normally retrieves things. Since that's pretty much guaranteed not to be the same as the order in which changes were made, we can't update the contextCSN until all of the changes have been received.
In delta-syncrepl the refresh comes from the log, and unless somebody has been explicitly mucking around in the log DB, the entries will always be returned in order, so it's possible to update the contextCSN after each entry has been received. But in this case it's up to the provider to send the cookie with each entry, and the syncprov overlay doesn't really know the difference between delta-sync and regular sync, so it doesn't send one per entry.
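To illustrate why the ordering matters, here is a toy model in Python. It is only a sketch and not how slapd is implemented: integers stand in for CSNs, the function name is made up, and it assumes a consumer that saves the highest CSN it has applied after every entry and, on restart, asks only for changes newer than that value.

```python
def skipped_after_restart(arrival_order, stop_after):
    """CSNs lost if the consumer checkpoints per entry and stops early.

    Assumption: the consumer stores the highest CSN applied so far, and on
    restart requests only changes newer than that checkpoint.
    """
    applied = set(arrival_order[:stop_after])
    checkpoint = max(applied)
    # Anything at or below the checkpoint that was never applied is skipped.
    return sorted(c for c in arrival_order if c <= checkpoint and c not in applied)


# Integers stand in for CSNs; larger means the change was made later.
search_order = [4, 1, 5, 2, 3]  # plain refresh: arbitrary database order
log_order = [1, 2, 3, 4, 5]     # delta-sync refresh: read from the log in order

print(skipped_after_restart(search_order, 2))  # [2, 3]
print(skipped_after_restart(log_order, 2))     # []
```

With the arbitrary search order, stopping after two entries leaves the checkpoint at 4 while changes 2 and 3 were never applied, so a restart would skip them. With the in-order log nothing is skipped wherever the consumer stops, which is why a per-entry update would be safe there.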
You could file an ITS for this, but I don't think we'll be changing this in 2.3.