--On Friday, April 20, 2007 11:55 AM +1000 Dave Horsfall daveh@ci.com.au wrote:
OpenLDAP 2.3.32 (our policy is to run STABLE unless there's a bugfix we need).
Most of our sites replicate directly to each other (SyncRepl; you need to know that data for a country is mastered in that country), except for one situation:
A <-> B <-> C
A and C are masters for their data, and B is a pure slave. For political reasons (i.e. it won't get fixed) A and C cannot replicate directly.
Because a schema change was not made on B, some updated data on A did not get through. All well and good, we fix the schema on B, and wait for the update (we use refreshAndPersist).
Except it never happened. Blowing away the slave on B caused it to update (of course), except it still never reached C, until it in turn was repopulated.
Am I looking at a replication bug? It seems to me that once the schema was fixed, the replication should have happened. Or am I not understanding how SyncRepl works?
This sounds strikingly similar to a bug I've encountered in the past with delta-syncrepl, where the CSN was incorrectly updated after a failed MOD (the MOD failed because the replicas had an overlay enabled that the master didn't). It has been on my to-do list to get the logs for this, but I've been busy with other things. I'll see if I can set some time aside to reproduce this and get the necessary information so it can be fixed.
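
To make the suspected failure mode concrete, here is a toy model in Python. It is only a sketch and not OpenLDAP code: the names Replica, sync_from, changes_since, wipe and advance_on_failure are hypothetical, CSNs are reduced to plain integers, and the schema check is reduced to a set of allowed attribute names. The assumption being illustrated is that a consumer only asks its provider for changes newer than the CSN it has recorded, so if that CSN moves forward after a failed write, the failed change is never requested again.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    schema: set                       # attribute names this server accepts
    advance_on_failure: bool = False  # True models the suspected bug
    cookie: int = 0                   # highest CSN this server believes it has seen
    entries: dict = field(default_factory=dict)  # dn -> (csn, attrs)

    def changes_since(self, cookie):
        """Entries with a CSN newer than the consumer's cookie, oldest first."""
        changed = [(csn, dn, attrs)
                   for dn, (csn, attrs) in self.entries.items() if csn > cookie]
        return sorted(changed, key=lambda item: item[0])

    def sync_from(self, provider):
        for csn, dn, attrs in provider.changes_since(self.cookie):
            accepted = set(attrs) <= self.schema
            if accepted:
                self.entries[dn] = (csn, attrs)
            elif not self.advance_on_failure:
                break  # leave the cookie alone so the change is retried later
            self.cookie = max(self.cookie, csn)

    def wipe(self):
        """Model blowing away the database: no entries, no recorded CSN."""
        self.entries.clear()
        self.cookie = 0


# A masters three entries; the second one uses an attribute B does not know yet.
a = Replica(schema={"cn", "newAttr"})
a.entries = {
    "cn=one": (1, ("cn",)),
    "cn=two": (2, ("cn", "newAttr")),
    "cn=three": (3, ("cn",)),
}
b = Replica(schema={"cn"}, advance_on_failure=True)
c = Replica(schema={"cn", "newAttr"})

b.sync_from(a)
c.sync_from(b)

b.schema.add("newAttr")   # fix the schema on B
b.sync_from(a)
print("cn=two" in b.entries)                        # False

b.wipe()
b.sync_from(a)
c.sync_from(b)
print("cn=two" in b.entries, "cn=two" in c.entries)  # True False

c.wipe()
c.sync_from(b)
print("cn=two" in c.entries)                        # True
```

In this model the schema fix alone changes nothing, because B's recorded CSN is already past the failed change. Wiping B forces a full refresh, but C still skips the entry because its own recorded CSN is newer than the entry's CSN, which matches the sequence described above. With advance_on_failure left at False, B stops at the failed change and picks it up on the next sync once the schema is fixed. Whether this is what actually happened here still needs the logs to confirm.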
--Quanah
--
Quanah Gibson-Mount
Senior Systems Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html