https://bugs.openldap.org/show_bug.cgi?id=10358
Issue ID: 10358
Summary: syncrepl can revert an entry's CSN
Product: OpenLDAP
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs@openldap.org
Reporter: ondra@mistotebe.net
Target Milestone: ---
Created attachment 1080
  --> https://bugs.openldap.org/attachment.cgi?id=1080&action=edit
Debug log of an instance of this happening
There is a sequence of operations which can force an MPR node to apply changes out of order (essentially reverting an operation). I am currently investigating which part of the code should have prevented this and why it let it slip.
A sample log showing how this happened is attached.
--- Comment #1 from Ondřej Kuzník <ondra@mistotebe.net> ---
On Tue, Jun 17, 2025 at 10:31:00AM +0000, openldap-its@openldap.org wrote:
> There is a sequence of operations which can force a MPR node to apply changes out of order (essentially reverting an operation). Currently investigating which part of the code that should have prevented this has let it slip.
>
> A sample log showing how this happened is attached.
Plain MPR needs two guarantees. First, a modification must never be applied twice; we ensure this by tracking the latest applied/pending CSN per serverid. Second, when we learn of a new version of an entry, only the latest version may be applied; we use the entryCSN to enforce serialisation across the cluster, so that for each entry we keep the latest version.
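The two checks can be sketched as a small self-contained model. This is not slapd code: the `Replica` class, `receive()` and `csn_sid()` are hypothetical names, and the sketch assumes CSNs use the `timestamp#count#sid#mod` layout with a hexadecimal sid and can be ordered by plain string comparison.

```python
from dataclasses import dataclass, field


def csn_sid(csn: str) -> int:
    """Extract the serverid, assuming a timestamp#count#sid#mod layout."""
    return int(csn.split("#")[2], 16)


@dataclass
class Replica:
    """Hypothetical model of one MPR node (not the slapd data structures)."""
    seen: dict = field(default_factory=dict)     # serverid -> latest CSN seen
    entries: dict = field(default_factory=dict)  # DN -> (entryCSN, value)

    def receive(self, dn: str, csn: str, value: str) -> bool:
        sid = csn_sid(csn)
        # Check 1: never apply the same modification twice.
        if csn <= self.seen.get(sid, ""):
            return False
        # Simplification: remember the CSN even if check 2 rejects the change.
        self.seen[sid] = csn
        # Check 2: keep only the latest version of each entry.
        current = self.entries.get(dn)
        if current is not None and csn <= current[0]:
            return False
        self.entries[dn] = (csn, value)
        return True


OLDER = "20250617103100.000001Z#000000#001#000000"
NEWER = "20250617103100.000002Z#000000#002#000000"

node = Replica()
node.receive("cn=x", NEWER, "v2")  # applied
node.receive("cn=x", NEWER, "v2")  # duplicate, rejected by check 1
node.receive("cn=x", OLDER, "v1")  # older than entryCSN, rejected by check 2
```

Note that check 2 only holds if nothing else modifies the entry between the comparison and the write.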
It is the latter that fails in this case. The syncrepl task runs first and gets as far as preparing the modification based on the version of the entry it sees. A Modify request then comes in and is applied. After that, syncrepl is scheduled again and applies its modification, reverting the entryCSN. There is no synchronisation between the two, nor is there a way to make sure the syncrepl thread is not interrupted between diffing the entry and applying the result to the DB.
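The interleaving can be replayed deterministically with a toy model. Everything here is an illustration under assumptions: the entry is a plain dict, `prepare_diff()` and `apply_mods()` are hypothetical helpers rather than slapd functions, and the CSN values are made up.

```python
C1 = "20250617103000.000000Z#000000#001#000000"  # version syncrepl sees
C2 = "20250617103001.000000Z#000000#002#000000"  # version sent by the peer
C3 = "20250617103002.000000Z#000000#001#000000"  # local Modify, newest

# Hypothetical stand-in for the entry as stored in the DB.
entry = {"entryCSN": C1, "description": "old"}


def prepare_diff(current: dict, incoming: dict) -> dict:
    """Build replace-style mods from the version visible right now."""
    return {k: v for k, v in incoming.items() if current.get(k) != v}


def apply_mods(target: dict, mods: dict) -> None:
    """Apply the mods unconditionally, with no precondition on the entry."""
    target.update(mods)


# 1. syncrepl diffs the incoming version against the entry it sees (C1).
diff = prepare_diff(entry, {"entryCSN": C2, "description": "from peer"})

# 2. Before the diff is applied, a local Modify with a newer CSN lands.
apply_mods(entry, {"entryCSN": C3, "description": "local"})
csn_after_modify = entry["entryCSN"]

# 3. syncrepl resumes and applies the now stale diff.
apply_mods(entry, diff)

# The entryCSN has moved backwards from C3 to C2.
reverted = entry["entryCSN"] < csn_after_modify
```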
We could go all in and introduce another way to serialise, or we could just tag the diff with a precondition that the entry has not changed in the meantime and detect a violation when the mod is applied. This can be done either by turning the entryCSN mod into a delete+add or by using the assertion control. With the assertion control we can ignore LDAP_ASSERTION_FAILED; with plain mods we would have to start ignoring LDAP_NO_SUCH_ATTRIBUTE. Excluding bugs, the only way either could happen is being preempted before the mod is applied. All of this applies only to messages handled by syncrepl_diff_entry, not to delta modifications, which have syncrepl_op_modify() to deal with MPR concerns.
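A rough sketch of the two precondition variants, again on a toy dict model rather than slapd internals. The function names are hypothetical; the result code values are the standard LDAP ones (noSuchAttribute = 16, assertionFailed = 122). How slapd would actually build and apply the mods is not shown.

```python
LDAP_SUCCESS = 0
LDAP_NO_SUCH_ATTRIBUTE = 16
LDAP_ASSERTION_FAILED = 122


def apply_delete_add(entry: dict, old_csn: str, new_csn: str, mods: dict) -> int:
    """Model the entryCSN change as delete(old value) + add(new value).

    Deleting a value that is no longer present fails the whole modification.
    """
    if entry["entryCSN"] != old_csn:
        return LDAP_NO_SUCH_ATTRIBUTE
    entry.update(mods)
    entry["entryCSN"] = new_csn
    return LDAP_SUCCESS


def apply_with_assertion(entry: dict, asserted_csn: str, new_csn: str, mods: dict) -> int:
    """Model a modification guarded by an (entryCSN=asserted_csn) assertion."""
    if entry["entryCSN"] != asserted_csn:
        return LDAP_ASSERTION_FAILED
    entry.update(mods)
    entry["entryCSN"] = new_csn
    return LDAP_SUCCESS


def diff_applied(rc: int) -> bool:
    """Treat a failed precondition as 'entry changed meanwhile, skip the diff'."""
    if rc in (LDAP_NO_SUCH_ATTRIBUTE, LDAP_ASSERTION_FAILED):
        return False
    if rc != LDAP_SUCCESS:
        raise RuntimeError(f"unexpected result code {rc}")
    return True


C1 = "20250617103000.000000Z#000000#001#000000"  # version the diff was built from
C2 = "20250617103001.000000Z#000000#002#000000"  # version sent by the peer
C3 = "20250617103002.000000Z#000000#001#000000"  # local Modify applied meanwhile

entry = {"entryCSN": C3, "description": "local"}
rc = apply_delete_add(entry, C1, C2, {"description": "from peer"})
skipped = not diff_applied(rc)  # the stale diff is dropped, entry stays at C3
```

In both variants the stale diff is rejected as a whole, so the newer local version survives and the entryCSN does not go backwards.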
I will see if the delete+add approach is practicable and post an MR to that effect if it is.