OpenLDAP 2.3.32 (our policy is to run STABLE unless there's a bugfix we need).
Most of our sites replicate direct to each other (SyncRepl; you need to know that data for a country is mastered in that country), except for one situation:
A <-> B <-> C
A and C are masters for their data, and B is a pure slave. For political reasons (i.e. it won't get fixed) A and C cannot replicate direct.
Because a schema change was not made on B, some updated data on A did not get through. All well and good, we fix the schema on B, and wait for the update (we use refreshAndPersist).
Except it never happened. Blowing away the slave database on B caused B to update (of course), but the change still never reached C until C in turn was repopulated.
Am I looking at a replication bug? It seems to me that once the schema was fixed, the replication should have happened. Or am I not understanding how SyncRepl works?
--On Friday, April 20, 2007 11:55 AM +1000 Dave Horsfall daveh@ci.com.au wrote:
> Am I looking at a replication bug? It seems to me that once the schema was fixed, the replication should have happened. Or am I not understanding how SyncRepl works?
This sounds strikingly similar to a bug I've encountered in the past with delta-syncrepl, where the CSN was incorrectly updated after a failed MOD (the MOD failed because the replicas had an overlay enabled that the master didn't). I've had it on my to-do list to get the logs for this, but have been busy with other things. I'll see if I can set some time aside to reproduce it and get the necessary information so it can be fixed.
--Quanah
-- Quanah Gibson-Mount Senior Systems Software Developer ITS/Shared Application Services Stanford University GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
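To illustrate the failure mode described above (a replica advancing its CSN after a failed MOD), here is a toy Python model. It is not OpenLDAP code and does not reproduce the real syncrepl protocol: the `Provider`, `Consumer`, and `Change` classes, the integer CSNs, and the "schema" modelled as a set of attribute names are all assumptions made up for this sketch. It only shows why a change can be lost for good once the replica's sync state has moved past it, and why wiping the replica brings the data back.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    csn: int      # stand-in for a real CSN; a larger number means newer
    dn: str
    attr: str     # attribute the change writes
    value: str


@dataclass
class Provider:
    log: list = field(default_factory=list)

    def changes_after(self, cookie: int) -> list:
        """Return the changes newer than the consumer's cookie, oldest first."""
        return [c for c in self.log if c.csn > cookie]


@dataclass
class Consumer:
    schema: set                  # attribute names this replica accepts
    advance_on_failure: bool     # True models the suspected bug
    cookie: int = 0              # last CSN the replica believes it has applied
    data: dict = field(default_factory=dict)

    def sync(self, provider: Provider) -> None:
        for change in provider.changes_after(self.cookie):
            ok = change.attr in self.schema   # a "schema check" in miniature
            if ok:
                self.data[(change.dn, change.attr)] = change.value
            if ok or self.advance_on_failure:
                self.cookie = change.csn
            else:
                break   # leave the cookie alone so the change is retried later

    def wipe(self) -> None:
        """Model blowing away the replica: no data and no sync state."""
        self.data.clear()
        self.cookie = 0


def run(advance_on_failure: bool, wipe: bool = False) -> bool:
    """Return True if the new attribute reaches the replica after the schema fix."""
    provider = Provider([Change(1, "cn=x", "cn", "x"),
                         Change(2, "cn=x", "newAttr", "v")])
    replica = Consumer(schema={"cn"}, advance_on_failure=advance_on_failure)
    replica.sync(provider)            # newAttr is rejected: schema not updated yet
    replica.schema.add("newAttr")     # schema fixed on the replica
    if wipe:
        replica.wipe()
    replica.sync(provider)
    return ("cn=x", "newAttr") in replica.data
```

In this model `run(True)` returns `False` (the change is never re-sent because the cookie already covers it), `run(False)` returns `True`, and `run(True, wipe=True)` returns `True`, which matches the behaviour reported in the thread: nothing arrives until the replica is repopulated from scratch.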
On Thu, 19 Apr 2007, Quanah Gibson-Mount wrote:
>> Am I looking at a replication bug? It seems to me that once the schema was fixed, the replication should have happened. Or am I not understanding how SyncRepl works?
> This sounds strikingly similar to a bug I've encountered in the past with delta-syncrepl, where the CSN was incorrectly updated after a failed MOD (the MOD failed because the replicas had an overlay enabled that the master didn't). I've had it on my to-do list to get the logs for this, but have been busy with other things. I'll see if I can set some time aside to reproduce it and get the necessary information so it can be fixed.
Thanks, Quanah; it's nice to know that I'm not alone :-)
In the meantime, I still have the "slapcat" backups around that time if they're of use to anyone.
--On Friday, April 20, 2007 12:55 PM +1000 Dave Horsfall daveh@ci.com.au wrote:
> Thanks, Quanah; it's nice to know that I'm not alone :-)
> In the meantime, I still have the "slapcat" backups around that time if they're of use to anyone.
Hi Dave,
I tested and verified that this error has been fixed since the 2.3.32 release. If you upgrade, you should not encounter this problem in the future.
Regards, Quanah
On Fri, 20 Apr 2007, Quanah Gibson-Mount wrote:
> I tested and verified that this error has been fixed since the 2.3.32 release. If you upgrade, you should not encounter this problem in the future.
Thanks, Quanah; I'll look at upgrading on the morrow.
On Sun, 22 Apr 2007, Dave Horsfall wrote:
>> I tested and verified that this error has been fixed since the 2.3.32 release. If you upgrade, you should not encounter this problem in the future.
> Thanks, Quanah; I'll look at upgrading on the morrow.
I confirm that 2.3.35 indeed fixes the problem.
Is there any reason why 2.3.35 has not been flagged as STABLE? I see I'm not the only one bound by policy :-)
Of course, if there are bugs in 2.3.35 then I'd like to know about them...
--On Tuesday, April 24, 2007 10:31 AM +1000 Dave Horsfall daveh@ci.com.au wrote:
> I confirm that 2.3.35 indeed fixes the problem.
> Is there any reason why 2.3.35 has not been flagged as STABLE? I see I'm not the only one bound by policy :-)
> Of course, if there are bugs in 2.3.35 then I'd like to know about them...
See:
http://www.stanford.edu/services/directory/openldap/configuration/openldap-build.html
My policy is, if I know a particular release fixes issues I'm having, and I've verified that it doesn't introduce any immediate new ones in dev/test/uat, then it goes to prod. To me, that is much more sane than a rather arbitrary tag.
--Quanah
Dave Horsfall wrote:
> Of course, if there are bugs in 2.3.35 then I'd like to know about them...
Yes, ITS#4925 comes to mind. We'll probably release 2.3.36 fairly soon. In the meantime, I expect to finish merging GnuTLS support in the next few days so that we can release a 2.4 beta.
On Tue, 24 Apr 2007, Howard Chu wrote:
>> Of course, if there are bugs in 2.3.35 then I'd like to know about them...
> Yes, ITS#4925 comes to mind. We'll probably release 2.3.36 fairly soon. In the meantime, I expect to finish merging GnuTLS support in the next few days so that we can release a 2.4 beta.
We don't use NOOP controls, but thanks for the heads-up.
openldap-software@openldap.org