After deploying delta-sync MMR at several customer sites, I have found that the general handling of conflict resolution in MMR mode is significantly suboptimal: it routinely causes the MMR nodes to get further out of sync, making things considerably worse (mainly due to ITS#8125).
The main issues I see are the following:
a) Two masters get different change requests at approximately the same time to add a value X to an attribute.
b) Two masters get different change requests at approximately the same time to delete a value X from an attribute.
In these two specific cases, in relaxed mode, rather than falling back and re-syncing the entire database, I think the conflict should be discarded (skipped), and logged as such. I.e., there is no actual discrepancy in the object. It still has X present in the add case, and X gone in the delete case.
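To make the proposed behavior concrete, here is a minimal sketch in Python. It is not slapd code: the Entry class, the apply_replicated_mod function, and the log list are hypothetical names made up for illustration. It only shows the idea that a replicated add of a value that is already present, or a delete of a value that is already gone, is skipped and logged rather than treated as a failure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entry:
    """Hypothetical in-memory entry: attribute name -> set of values."""
    attrs: Dict[str, Set[str]] = field(default_factory=dict)


def apply_replicated_mod(entry: Entry, op: str, attr: str, value: str,
                         log: List[str]) -> bool:
    """Apply a replicated single-value add or delete.

    Returns True if the entry changed, False if the change was a no-op
    that was skipped and logged (the "relaxed" handling proposed above).
    """
    values = entry.attrs.get(attr, set())
    if op == "add":
        if value in values:
            # The other master already added X: the entry is already in
            # the desired state, so skip instead of forcing a resync.
            log.append(f"skipped add: {attr}={value} already present")
            return False
        entry.attrs.setdefault(attr, set()).add(value)
        return True
    if op == "delete":
        if value not in values:
            # The other master already deleted X: nothing left to do.
            log.append(f"skipped delete: {attr}={value} already absent")
            return False
        values.discard(value)
        if not values:
            # Drop the attribute once its last value is gone.
            del entry.attrs[attr]
        return True
    raise ValueError(f"unsupported operation: {op}")
```

In both skipped cases the entry ends up in the same state it would have had if only one of the two changes had arrived, which is why no fallback should be needed.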
At most, if we're going to fall back, we should only resync the specific entry. The overall behavior I'm seeing from OpenLDAP is that the masters get into an endless cycle of resync, and the more they do so, the more out of sync they become. Eventually you have to stop all masters, export all their DBs, sort them, find the entries missing between all sets of masters, and build a brand new DB with which to reload them, which only lasts until they get massively out of sync again. I.e., the current resync strategy is doing no favors to anyone. It may work OK on very small DBs, where a resync takes only seconds, but on larger DBs, where such syncs take anywhere from 30+ minutes to hours, it is not a useful methodology.
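For reference, the "find missing entries" step of that manual recovery amounts to a set comparison. The sketch below is only an illustration under stated assumptions: find_missing_entries is a hypothetical helper, the DNs are assumed to have already been extracted from each master's export, and the whitespace-strip plus lowercase normalization is a naive stand-in for real DN normalization.

```python
from typing import Dict, Iterable, List


def find_missing_entries(exports: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each master name to the sorted DNs it lacks.

    `exports` maps a master name to the DNs found in its export.
    A DN counts as missing on a master if any other master has it.
    """
    # Naive normalization (an assumption): trim and lowercase each DN.
    normalized = {
        master: {dn.strip().lower() for dn in dns}
        for master, dns in exports.items()
    }
    # Union of every DN seen on any master.
    all_dns = set().union(*normalized.values())
    return {
        master: sorted(all_dns - dns)
        for master, dns in normalized.items()
    }
```

Example usage with two made-up masters:

```python
missing = find_missing_entries({
    "master1": ["uid=a,dc=example,dc=com", "uid=b,dc=example,dc=com"],
    "master2": ["UID=A,dc=example,dc=com", "uid=c,dc=example,dc=com"],
})
# master1 lacks uid=c, master2 lacks uid=b
```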
--Quanah
--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration