ondra@mistotebe.net wrote:
This is my understanding of the above discussion:
- a deltasync consumer has just switched to full refresh (but is ahead of this provider in some ways)
- provider sends the present list
- consumer deletes extra entries, builds a new cookie
- the problem is that the new cookie is built as the union of the local and received cookies, even though we may have just undone some of the changes it claims to cover, and we will then ignore those changes
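To make the last point concrete, here is a rough sketch of how a per-SID union can overstate what the consumer holds. This is not slapd code: the cookie is modelled as a plain dict from SID to CSN string, `merge_cookies` is a hypothetical helper, and the CSN values are made up. It relies only on CSNs being fixed-width strings, so string comparison orders them.

```python
# Hypothetical model: a cookie is a dict {sid: csn}; CSNs are fixed-width
# strings (time#count#sid#mod), so plain string comparison orders them.

def merge_cookies(local, received):
    """Return the per-SID maximum of two CSN sets (the 'union' cookie)."""
    merged = dict(local)
    for sid, csn in received.items():
        if sid not in merged or csn > merged[sid]:
            merged[sid] = csn
    return merged

# Made-up example: the consumer is ahead of this provider for SID 2.
local = {
    1: "20240101000000.000000Z#000000#001#000000",
    2: "20240101000500.000000Z#000000#002#000000",
}
received = {
    1: "20240101000100.000000Z#000000#001#000000",
    2: "20240101000200.000000Z#000000#002#000000",
}

merged = merge_cookies(local, received)

# SIDs for which the merged cookie claims more than the provider sent.
# If the present phase deleted entries written under those SIDs, the
# merged cookie still says we have them.
overstated = sorted(
    sid for sid, csn in merged.items() if csn > received.get(sid, "")
)
```

Here `overstated` contains only SID 2: the merged cookie keeps the local CSN for that SID even if the present list has just removed entries carrying those changes.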
If that's accurate, there are some approaches that could fix it:
1. The simple one is to remember the actual cookie we got from the server and refuse to delete entries whose entryCSN is ahead of the provided CSN set. The problem is that this takes us even further from being able to replicate from a generic RFC4533 provider.
2. Instead, when the present phase is initiated, we might terminate all other sessions, adopt the complete CSN set, and restart them only once the new CSN set has been fully established.
(2) makes sense.
Also, whenever we fall back from deltasync into plain syncrepl, we should make sure that the accesslog entries we generate from this are never used for further replication, though that might be considered a separate issue.
That should already be the case, since none of these ops will have a valid CSN.
Maybe the ITS#8486 work might be useful for this, if we have a way of signalling to accesslog to reset minCSN according to the new CSN set.
The former is simpler, but the latter feels like the only one that actually addresses these problems in full.
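
For reference, the simpler first approach (refusing to delete entries whose entryCSN is ahead of the provider's CSN set) might look roughly like the sketch below. It is only an illustration under assumptions: the cookie is again a dict from SID to CSN, local entries are a dict from DN to entryCSN, and `csn_sid`, `is_ahead` and `entries_to_delete` are hypothetical names, not functions from syncrepl.c. It also treats an SID missing from the provider's cookie as "ahead", which is a guess at the intended behaviour.

```python
def csn_sid(csn):
    """Extract the SID (third '#'-separated field, hex) from a CSN string."""
    return int(csn.split("#")[2], 16)

def is_ahead(entry_csn, provider_cookie):
    """True if the provider's CSN set does not yet cover this entryCSN."""
    known = provider_cookie.get(csn_sid(entry_csn))
    # Assumption: an SID the provider has never reported counts as ahead.
    return known is None or entry_csn > known

def entries_to_delete(local_entries, present_dns, provider_cookie):
    """DNs absent from the present list that are safe to delete."""
    return sorted(
        dn
        for dn, entry_csn in local_entries.items()
        if dn not in present_dns and not is_ahead(entry_csn, provider_cookie)
    )

# Made-up data: the provider has seen SID 2 only up to 00:02:00.
provider_cookie = {
    1: "20240101000100.000000Z#000000#001#000000",
    2: "20240101000200.000000Z#000000#002#000000",
}
local_entries = {
    "cn=a,dc=example,dc=com": "20240101000050.000000Z#000000#001#000000",
    "cn=b,dc=example,dc=com": "20240101000100.000000Z#000000#002#000000",
    "cn=c,dc=example,dc=com": "20240101000500.000000Z#000000#002#000000",
}
present_dns = {"cn=a,dc=example,dc=com"}

doomed = entries_to_delete(local_entries, present_dns, provider_cookie)
```

With this data only `cn=b` is deleted: `cn=a` is in the present list, and `cn=c` is kept because its entryCSN is newer than anything the provider has reported for SID 2. The check depends on entryCSN and on the contextCSN-style cookie, which is exactly what a generic RFC4533 provider does not promise.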