I'm still seeing cases where deleted entries are getting resurrected when a number of concurrent Add/Delete sequences are occurring, with multiple MMR servers (4 minimum to show the error).
The problem begins because multiple writes are outstanding, and they are replicated in persist mode without a CSN in their syncrepl cookie. This is a normal occurrence when the current op does not correspond to the last committed CSN.
Because there is no CSN, the consumer doesn't update its cookie state while performing a particular op.
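In rough C, the consumer-side cookie handling amounts to something like this (a simplified sketch only; the names are invented and this is not the actual syncrepl.c code):

    #include <string.h>

    /* Advance the consumer's stored contextCSN only when the sync
     * cookie actually carries a CSN.  A CSN-less op leaves the
     * cookie state where it was. */
    static void
    update_cookie_state( char state_csn[64], const char *cookie_csn )
    {
        if ( cookie_csn == NULL )
            return;    /* no CSN in the cookie: state stays stale */
        if ( strcmp( cookie_csn, state_csn ) > 0 )
            strncpy( state_csn, cookie_csn, 63 );
    }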
As a result, if a client does Add/Delete/Add/Delete of the same DN, it's possible for the Adds to propagate several times (more than the client actually executed).
Adds and Modifies can usually be rejected if they're too old, because they carry an entryCSN attribute which can be compared against the existing entry, even if the consumer cookie state has not been updated. But Deletes don't carry any attributes, and Deleted entries can't be checked.
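Sketched in C (again simplified, with invented names; the strcmp works here because CSN values have fixed-width leading timestamp fields and so sort correctly as plain strings):

    #include <string.h>

    /* Return nonzero if an incoming Add/Modify is older than the
     * entry we already have and can be rejected. */
    static int
    op_is_stale( const char *incoming_csn, const char *stored_csn )
    {
        /* A Delete carries no entryCSN and its target entry is
         * gone, so one or both sides of the comparison are missing
         * and the op must be accepted. */
        if ( incoming_csn == NULL || stored_csn == NULL )
            return 0;
        return strcmp( incoming_csn, stored_csn ) <= 0;
    }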
So, given an MMR setup like so:
1 -- 2
|    |
3 -- 4
A sequence of Add/Del/Add/Del performed at server 1 will be replicated to both 2 and 3 immediately. They will then cascade it to server 4. If many other writes were occurring at the same time, causing these writes to be propagated without a cookie CSN, then server 4 will propagate them back to 3 and 2 respectively, and 3 and 2 will re-add the deleted entries because they have nothing to check that says the Adds are old. This cycle only gets broken if server 1 eventually sends an op with accompanying cookie update, so that all the downstream servers can see that the ops are old.
...
OK, upon further digging, this appears to be caused by ITS#6024. rein's patch prevents the consumer and provider from informing each other of their SIDs when no CSN is present; this prevents syncprov's propagation loop detection from working. Sigh. Reverting ITS#6024 patch...
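The loop detection in question boils down to roughly this (a hypothetical sketch; syncprov keeps more state than shown, and the names here are invented):

    /* Skip propagating an op back toward the server it came from:
     * if this consumer has announced the op's originating SID, the
     * op would loop.  When no CSN (and hence no SID) was exchanged,
     * nsids stays 0 and every op is forwarded -- loops included. */
    static int
    op_would_loop( const int *consumer_sids, int nsids, int origin_sid )
    {
        int i;
        for ( i = 0; i < nsids; i++ )
            if ( consumer_sids[i] == origin_sid )
                return 1;
        return 0;
    }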
On 13.02.12 00:47, Howard Chu wrote:
Long time with little OpenLDAP work, but I'm still around ;-)
I'm still seeing cases where deleted entries are getting resurrected when a number of concurrent Add/Delete sequences are occurring, with multiple MMR servers (4 minimum to show the error).
Just for the record, this is not the problem reported in this ITS, the ITS bug is the same as discussed in this thread:
http://www.openldap.org/lists/openldap-devel/201012/msg00018.html
The queuing of an old CSN done as a fix to ITS#7052 may have introduced a new race condition; an ITS and fix are coming.
I would prefer a rewrite so that only the frontend assigned CSNs to operations, though. The current situation, where syncrepl attaches the entryCSN or old CSNs to the operation just to prevent the backend from generating new CSNs, appears to me like curing the symptom rather than the sickness.
The problem begins because multiple writes are outstanding, and they are replicated in persist mode without a CSN in their syncrepl cookie. This is a normal occurrence when the current op does not correspond to the last committed CSN.
This looks to me like the root of the problem seen here. Replicating without a CSN implies replicating possibly incomplete state, and when there are multiple paths by which these operations can reach a server we end up with race conditions.
I'd prefer that all changes replicated in persist mode carried a single CSN and were replicated in CSN order (for all CSNs with the same SID that is). It is probably sufficient to enforce this in MMR mode though.
The replicated changes are already being serialized, so serializing them in CSN order shouldn't stall things noticeably and would eliminate the type of race conditions seen here. And I guess it's already required for delta replication?
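The ordering I have in mind is roughly this (an illustration of the queueing idea only, with invented names, not an implementation):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct op {
        char csn[64];
        struct op *next;
    } op;

    static op *pending;    /* ops from one SID, kept in CSN order */

    static void
    enqueue( const char *csn )
    {
        op **p = &pending, *n = calloc( 1, sizeof( op ) );
        strncpy( n->csn, csn, 63 );
        while ( *p && strcmp( (*p)->csn, csn ) < 0 )
            p = &(*p)->next;
        n->next = *p;      /* insert sorted, ascending CSN */
        *p = n;
    }

    static void
    drain( void )          /* apply in CSN order once it is safe */
    {
        while ( pending ) {
            op *o = pending;
            pending = o->next;
            printf( "apply %s\n", o->csn );
            free( o );
        }
    }

    int
    main( void )
    {
        enqueue( "20120223210001.000000Z#000000#001#000000" );
        enqueue( "20120223210000.000000Z#000000#001#000000" );
        drain();           /* applies the older CSN first */
        return 0;
    }

Deciding when it is safe to drain, i.e. that no earlier CSN from the same SID is still in flight, is of course the hard part.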
The major drawback would be that after a refresh, syncprov would have to force its consumers to refresh as well. I.e., the first hop in a chain would have to complete its refresh before the next hop starts seeing the updates. But database consistency is most important to me, so I would have no problem living with that.
Because there is no CSN, the consumer doesn't update its cookie state while performing a particular op.
As a result, if a client does Add/Delete/Add/Delete of the same DN, it's possible for the Adds to propagate several times (more than the client actually executed).
Adds and Modifies can usually be rejected if they're too old, because they carry an entryCSN attribute which can be compared against the existing entry, even if the consumer cookie state has not been updated. But Deletes don't carry any attributes, and Deleted entries can't be checked.
So, given an MMR setup like so:
1 -- 2
|    |
3 -- 4
A sequence of Add/Del/Add/Del performed at server 1 will be replicated to both 2 and 3 immediately. They will then cascade it to server 4. If many other writes were occurring at the same time, causing these writes to be propagated without a cookie CSN, then server 4 will propagate them back to 3 and 2 respectively, and 3 and 2 will re-add the deleted entries because they have nothing to check that says the Adds are old. This cycle only gets broken if server 1 eventually sends an op with accompanying cookie update, so that all the downstream servers can see that the ops are old.
There are actually two possible race conditions in this configuration, when an add/delete is performed on the same DN:
1) The add is sent without CSN, the delete with. Assume that the add/delete is handled by server 3 before it receives them from server 4. It will then act upon the CSN-less add and discard the delete as already being seen, and end up with an entry not present on the origin server.
2) Neither the add nor the delete is sent with a CSN. This can lead to the endless add/delete cycle outlined above when there exist loops in the MMR topology. The cycle will only be broken if the same DN is re-added with a CSN; updating the CSN by changing other entries is not sufficient. The wild CSN-less add will be stopped when it reaches a server with the newly added entry, and hence also the delete. But which servers will end up acting on the delete is yet another race condition :-(
Hm, given that replication handles add and modify fairly equally, could a modify/delete sequence be sufficient to trigger these race conditions?
OK, upon further digging, this appears to be caused by ITS#6024. rein's patch prevents the consumer and provider from informing each other of their SIDs when no CSN is present; this prevents syncprov's propagation loop detection from working. Sigh. Reverting ITS#6024 patch...
Unfortunately, this will not fix scenario 1, and will fix scenario 2 only when all loops include the server initiating the change. The rid and sid fields of the cookie are not sufficient for loop detection in the general case, and as such should only be used for optimization.
A new test script which exercises these race conditions is coming.
Rein
Rein Tollevik wrote:
On 13.02.12 00:47, Howard Chu wrote:
Long time with little OpenLDAP work, but I'm still around ;-)
Glad to hear from you ;)
I'm still seeing cases where deleted entries are getting resurrected when a number of concurrent Add/Delete sequences are occurring, with multiple MMR servers (4 minimum to show the error).
Just for the record, this is not the problem reported in this ITS, the ITS bug is the same as discussed in this thread:
http://www.openldap.org/lists/openldap-devel/201012/msg00018.html
The queuing of an old CSN done as a fix to ITS#7052 may have introduced a new race condition; an ITS and fix are coming.
OK.
I would prefer a rewrite so that only the frontend assigned CSNs to operations, though. The current situation, where syncrepl attaches the entryCSN or old CSNs to the operation just to prevent the backend from generating new CSNs, appears to me like curing the symptom rather than the sickness.
That doesn't seem practical, since syncrepl operates behind the frontend (as do many other overlays, etc.).
The problem begins because multiple writes are outstanding, and they are replicated in persist mode without a CSN in their syncrepl cookie. This is a normal occurrence when the current op does not correspond to the last committed CSN.
This looks to me like the root of the problem seen here. Replicating without a CSN implies replicating possibly incomplete state, and when there are multiple paths by which these operations can reach a server we end up with race conditions.
I'd prefer that all changes replicated in persist mode carried a single CSN and were replicated in CSN order (for all CSNs with the same SID that is). It is probably sufficient to enforce this in MMR mode though.
The replicated changes are already being serialized, so serializing them in CSN order shouldn't stall things noticeably and would eliminate the type of race conditions seen here. And I guess it's already required for delta replication?
Yes, that works. I've tested with a mutex in place to guarantee this. It seemed like a pretty heavy restriction, so I didn't move forward with this approach. Also, I didn't arrange it so serialization was only enforced per SID; that might be more suitable.
And yes, delta replication already does this since accesslog does full serialization.
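The test amounted to roughly the following (a sketch; the names are invented):

    #include <pthread.h>

    static pthread_mutex_t repl_mutex = PTHREAD_MUTEX_INITIALIZER;

    static void
    replicate_op( void *op )    /* stands in for Operation * */
    {
        pthread_mutex_lock( &repl_mutex );
        /* read the op's CSN and queue it to every persist-mode
         * consumer before any later write can overtake it */
        (void)op;
        pthread_mutex_unlock( &repl_mutex );
    }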
The major drawback would be that after a refresh, syncprov would have to force its consumers to refresh as well. I.e., the first hop in a chain would have to complete its refresh before the next hop starts seeing the updates. But database consistency is most important to me, so I would have no problem living with that.
We've talked about this in the past as a desirable feature. Unfortunately it would impose a noticeable startup delay before downstream consumers see any updates, and in MMR it's a recipe for instant deadlock.
Because there is no CSN, the consumer doesn't update its cookie state while performing a particular op.
As a result, if a client does Add/Delete/Add/Delete of the same DN, it's possible for the Adds to propagate several times (more than the client actually executed).
Adds and Modifies can usually be rejected if they're too old, because they carry an entryCSN attribute which can be compared against the existing entry, even if the consumer cookie state has not been updated. But Deletes don't carry any attributes, and Deleted entries can't be checked.
So, given an MMR setup like so:
1 -- 2
|    |
3 -- 4
A sequence of Add/Del/Add/Del performed at server 1 will be replicated to both 2 and 3 immediately. They will then cascade it to server 4. If many other writes were occurring at the same time, causing these writes to be propagated without a cookie CSN, then server 4 will propagate them back to 3 and 2 respectively, and 3 and 2 will re-add the deleted entries because they have nothing to check that says the Adds are old. This cycle only gets broken if server 1 eventually sends an op with accompanying cookie update, so that all the downstream servers can see that the ops are old.
There are actually two possible race conditions in this configuration, when an add/delete is performed on the same DN:
1) The add is sent without CSN, the delete with. Assume that the add/delete is handled by server 3 before it receives them from server 4. It will then act upon the CSN-less add and discard the delete as already being seen, and end up with an entry not present on the origin server.
2) Neither the add nor the delete is sent with a CSN. This can lead to the endless add/delete cycle outlined above when there exist loops in the MMR topology. The cycle will only be broken if the same DN is re-added with a CSN; updating the CSN by changing other entries is not sufficient. The wild CSN-less add will be stopped when it reaches a server with the newly added entry, and hence also the delete. But which servers will end up acting on the delete is yet another race condition :-(
Hm, given that replication handles add and modify fairly equally, could a modify/delete sequence be sufficient to trigger these race conditions?
That would require a modify of a nonexistent entry on at least one of the servers. I don't think you'll see this...
OK, upon further digging, this appears to be caused by ITS#6024. rein's patch prevents the consumer and provider from informing each other of their SIDs when no CSN is present; this prevents syncprov's propagation loop detection from working. Sigh. Reverting ITS#6024 patch...
Unfortunately, this will not fix scenario 1, and will fix scenario 2 only when all loops include the server initiating the change. The rid and sid fields of the cookie are not sufficient for loop detection in the general case, and as such should only be used for optimization.
A new test script which exercises these race conditions is coming.
--On Thursday, February 23, 2012 9:13 PM +0100 Rein Tollevik <rein@OpenLDAP.org> wrote:
The queuing of an old CSN done as a fix to ITS#7052 may have introduced a new race condition; an ITS and fix are coming.
A new test script which exercises these race conditions is coming.
Rein,
I was about to issue a testing call for RE24 for 2.4.30. If this ITS & fix are going to be coming soon, I will hold off. Do you have an ETA on them?
Thanks, Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
On 23.02.12 22:33, Quanah Gibson-Mount wrote:
--On Thursday, February 23, 2012 9:13 PM +0100 Rein Tollevik <rein@OpenLDAP.org> wrote:
The queuing of an old CSN done as a fix to ITS#7052 may have introduced a new race condition; an ITS and fix are coming.
A new test script which exercises these race conditions is coming.
I was about to issue a testing call for RE24 for 2.4.30. If this ITS & fix are going to be coming soon, I will hold off. Do you have an ETA on them?
After having analyzed my logs a bit more, I'm fairly sure the 7052 fix didn't introduce any new races after all :-) It's just confusing how "old" CSNs are attached to ops that are received without any. I.e., my changes should wait for the next release.
Rein
Rein Tollevik wrote:
On 23.02.12 22:33, Quanah Gibson-Mount wrote:
--On Thursday, February 23, 2012 9:13 PM +0100 Rein Tollevik <rein@OpenLDAP.org> wrote:
The queuing of an old CSN done as a fix to ITS#7052 may have introduced a new race condition; an ITS and fix are coming.
A new test script which exercises these race conditions is coming.
I was about to issue a testing call for RE24 for 2.4.30. If this ITS & fix are going to be coming soon, I will hold off. Do you have an ETA on them?
After having analyzed my logs a bit more, I'm fairly sure the 7052 fix didn't introduce any new races after all :-)
No new races, but some old ones still remain. So far the only thing I've found to reliably prevent them is to serialize all writes in syncprov. Have not committed this yet; wondering if there are other conditions we should place on this. E.g., only if mirrormode is configured?
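The conditional form I'm considering looks roughly like this (a sketch only; the si_mirrormode field and function name are invented for illustration):

    #include <pthread.h>

    typedef struct syncprov_info {
        int si_mirrormode;   /* nonzero when MMR is configured */
        pthread_mutex_t si_ops_mutex;
    } syncprov_info;

    static void
    syncprov_serialized_write( syncprov_info *si )
    {
        /* only pay the serialization cost in multi-master mode */
        if ( si->si_mirrormode )
            pthread_mutex_lock( &si->si_ops_mutex );
        /* ... perform the write and propagate it ... */
        if ( si->si_mirrormode )
            pthread_mutex_unlock( &si->si_ops_mutex );
    }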