Hello list.
I have two OpenLDAP 2.4.23 servers running in a multi-master setup whose content diverged at some point. Fortunately, I had separate backup files for both, which allowed me to reconcile the content manually. I reimported the resulting LDIF file into one of the servers, re-exported it with slapcat, and reimported it into the other server, after dropping all existing databases (including the accesslog one).
However, I end up with two databases with multiple and divergent contextCSN values:

contextCSN: 20120420144311.217351Z#000000#000#000000
contextCSN: 20120903132738.849382Z#000000#001#000000
contextCSN: 20120903132426.927826Z#000000#002#000000

contextCSN: 20120420144311.217351Z#000000#000#000000
contextCSN: 20120903132426.925966Z#000000#001#000000
contextCSN: 20120903132924.793560Z#000000#002#000000
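To make the divergence between the two sets of values easier to see, here is a minimal Python sketch that parses each contextCSN and compares the two servers per server ID. It assumes the usual CSN layout `timestamp#count#sid#mod` with hexadecimal count, sid and mod fields; the names `parse_csn`, `compare`, `SERVER_A` and `SERVER_B` are made up for this illustration and are not part of any OpenLDAP tool.

```python
from datetime import datetime
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime
    count: int
    sid: int
    mod: int


def parse_csn(value: str) -> CSN:
    """Split a CSN of the assumed form <timestamp>#<count>#<sid>#<mod>."""
    stamp, count, sid, mod = value.split("#")
    return CSN(
        datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ"),
        int(count, 16),
        int(sid, 16),
        int(mod, 16),
    )


def newest_per_sid(values):
    """Keep only the most recent CSN seen for each server ID."""
    newest = {}
    for csn in map(parse_csn, values):
        if csn.sid not in newest or csn > newest[csn.sid]:
            newest[csn.sid] = csn
    return newest


def compare(a_values, b_values):
    """Report, per server ID, which of the two value sets is ahead."""
    a, b = newest_per_sid(a_values), newest_per_sid(b_values)
    report = {}
    for sid in sorted(a.keys() | b.keys()):
        if a.get(sid) == b.get(sid):
            report[sid] = "in sync"
        elif sid not in b or (sid in a and a[sid] > b[sid]):
            report[sid] = "first server ahead"
        else:
            report[sid] = "second server ahead"
    return report


# The two sets of values quoted in the message above.
SERVER_A = [
    "20120420144311.217351Z#000000#000#000000",
    "20120903132738.849382Z#000000#001#000000",
    "20120903132426.927826Z#000000#002#000000",
]
SERVER_B = [
    "20120420144311.217351Z#000000#000#000000",
    "20120903132426.925966Z#000000#001#000000",
    "20120903132924.793560Z#000000#002#000000",
]

if __name__ == "__main__":
    for sid, status in compare(SERVER_A, SERVER_B).items():
        print(f"SID {sid:03x}: {status}")
```

With the values above, each server holds the newer CSN for one of the two non-zero server IDs, and both agree on the value for server ID 000.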
Any attempt to manually reduce this to a single value in the backup file before restoring it ended in endless refresh requests from the second server and "stale cookie" error messages.
So, my questions are:
- Is it expected to have multiple values for contextCSN?
- Does it hurt, beyond making synchronisation checks almost impossible?
- How do I return to a stable situation?
I can't easily change the LDAP version, and I'd rather return to a classic master-slave setup if the problem is not fixable otherwise.
--On Monday, September 03, 2012 3:52 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
> I can't easily change the LDAP version, and I'd rather return to a classic master-slave setup if the problem is not fixable otherwise.
I would advise you to read the CHANGES file and look at all the fixes for Syncrepl and MMR since 2.4.23 was released, and then hopefully you realize that either upgrading the *patch* level or going back to classic provider-replica setup is the wise choice.
Since the CHANGES file is publicly available on the web, I'm somewhat unclear as to why you haven't already read it over and come to a very quick and easy conclusion.
http://www.openldap.org/software/release/changes.html
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Hello, Quanah,
I do not understand the irritation. I have read the changes.html file and I do not see any entry saying "fixed Multi-Master replication as it was totally damned. UPGRADE NOW OR YOUR DATA WILL GO DOWN".
The truth is that I subscribed to this list because I was seeing a problem similar to the one Guillaume reported, and I did have a look at the changelog.
Is OpenLDAP THAT broken below 2.4.32?
Cheers, Ballock
--On Tuesday, September 04, 2012 11:20 AM +0200 ballock boleslaw.tokarski@tieto.com wrote:
> Hello, Quanah,
>
> I do not understand the irritation. I have read the changes.html file and I do not see any entry saying "fixed Multi-Master replication as it was totally damned. UPGRADE NOW OR YOUR DATA WILL GO DOWN".
>
> The truth is that I subscribed to this list because I was seeing a problem similar to the one Guillaume reported, and I did have a look at the changelog.
>
> Is OpenLDAP THAT broken below 2.4.32?
No such text would ever appear in the changes log. However, someone who has some basic reading skills can probably discern that the following fixes are likely to impact their installation, particularly the fix specific to MMR in 2.4.24. Given the fact that MMR uses syncrepl, one then has to track all the fixes made to syncrepl/syncprov as well.
OpenLDAP 2.4.30 Release (2012/02/29)
    Fixed slapo-syncprov loop detection (ITS#6024)

OpenLDAP 2.4.29 Release (2012/02/12)
    Fixed slapd syncrepl reference to freed memory (ITS#7127,ITS#7132)
    Fixed slapd syncrepl to ignore some errors on delete (ITS#7052)
    Fixed slapd syncrepl to handle missing oldRDN (ITS#7144)
    Fixed slapo-syncprov with already abandoned operation (ITS#7150)

OpenLDAP 2.4.27 Release (2011/11/24)
    Fixed slapd syncrepl crash with non-replicated ops (ITS#6892)
    Fixed slapd syncrepl with modrdn (ITS#7000,ITS#6472)
    Fixed slapd syncrepl timeout when using refreshAndPersist (ITS#6999)
    Fixed slapd syncrepl deletes need a non-empty CSN (ITS#7052)
    Fixed slapd syncrepl glue for empty suffix (ITS#703)
    Fixed slapo-syncprov DSA attribute filtering for Persist mode (ITS#7019)
    Fixed slapo-syncprov when consumer has newer state of our SID (ITS#7040)
    Fixed slapo-syncprov crash (ITS#7025)

OpenLDAP 2.4.26 Release (2011/06/30)
    Fixed slapd syncrepl crash with non-replicated ops (ITS#6892)
    Fixed slapo-syncprov with replicated subtrees (ITS#6872)

OpenLDAP 2.4.24 Release (2011/02/10)
    Fixed slapd sortvals of attributes with 1 value (ITS#6715)
    Fixed slapd syncrepl reuse of presence list (ITS#6707)
    Fixed slapd syncrepl uninitialized return code (ITS#6719)
    Fixed slapd syncrepl variable initialization (ITS#6739)
    Fixed slapd syncrepl refresh to use complete cookie (ITS#6807)
    Fixed slapo-syncprov to send error if consumer is newer (ITS#6606)
    Fixed slapo-syncprov filter race condition (ITS#6708)
    Fixed slapo-syncprov active mod race (ITS#6709)
    Fixed slapo-syncprov to refresh if context is dirty (ITS#6710)
    Fixed slapo-syncprov CSN updates to all replicas (ITS#6718)
    Fixed slapo-syncprov sessionlog ordering (ITS#6716)
    Fixed slapo-syncprov sessionlog with adds (ITS#6503)
    Fixed slapo-syncprov mutex (ITS#6438)
    Fixed slapo-syncprov mincsn check with MMR (ITS#6717)
    Fixed slapo-syncprov control leak (ITS#6795)
    Fixed slapo-syncprov error codes (ITS#6812)
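For anyone who wants to repeat this kind of review against the full CHANGES file, the filtering can be sketched in a few lines of Python. This is only an illustration: it assumes the file keeps the layout quoted above (a release header line starting with "OpenLDAP <version> Release", followed by one fix per line), and `replication_fixes` and `SAMPLE` are hypothetical names, not part of any OpenLDAP tooling.

```python
import re

# Assumed header layout, e.g. "OpenLDAP 2.4.30 Release (2012/02/29)".
RELEASE = re.compile(r"^OpenLDAP (\S+) Release")


def replication_fixes(changes_text, keywords=("syncrepl", "syncprov")):
    """Group changelog entries that mention replication by release."""
    fixes = {}
    release = None
    for line in changes_text.splitlines():
        entry = line.strip()
        match = RELEASE.match(entry)
        if match:
            release = match.group(1)
            continue
        if release and any(word in entry for word in keywords):
            fixes.setdefault(release, []).append(entry)
    return fixes


# A short excerpt taken from the entries quoted above.
SAMPLE = """\
OpenLDAP 2.4.30 Release (2012/02/29)
    Fixed slapo-syncprov loop detection (ITS#6024)
OpenLDAP 2.4.24 Release (2011/02/10)
    Fixed slapd sortvals of attributes with 1 value (ITS#6715)
    Fixed slapd syncrepl reuse of presence list (ITS#6707)
    Fixed slapo-syncprov mincsn check with MMR (ITS#6717)
"""

if __name__ == "__main__":
    for version, entries in replication_fixes(SAMPLE).items():
        print(version)
        for entry in entries:
            print("   ", entry)
```

A keyword match only narrows the list down; judging which of the matching fixes actually affects a given deployment still requires reading the entries and the linked ITS reports.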
--Quanah
On 03/09/2012 22:45, Quanah Gibson-Mount wrote:
> --On Monday, September 03, 2012 3:52 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
>> I can't easily change the LDAP version, and I'd rather return to a classic master-slave setup if the problem is not fixable otherwise.
>
> I would advise you to read the CHANGES file and look at all the fixes for Syncrepl and MMR since 2.4.23 was released, and then hopefully you realize that either upgrading the *patch* level or going back to classic provider-replica setup is the wise choice.
>
> Since the CHANGES file is publicly available on the web, I'm somewhat unclear as to why you haven't already read it over and come to a very quick and easy conclusion.
I did, of course.
However, the number of issues alone isn't enough to distinguish corner cases from issues with immediate impact. Given that I've been running LDAP server 2.4.19 in multi-master mode for years without actual problems, I'm used to taking the usual 'you should never run anything other than the latest version, otherwise everything will blow up' advice with a grain of salt. Also, I'm in a very low-concurrency setup, and I could easily make sure that 99% of write operations always go to a single server.
BTW, I just figured out that multiple contextCSN values are perfectly normal: one is expected per data provider, and the third value (the one labelled with SID 000) was due to artifacts of my initial dataset.
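The "one contextCSN per provider" rule can be checked mechanically. The Python sketch below groups contextCSN values by their server ID field and flags IDs that appear more than once, IDs that no provider is configured with, and configured IDs that are missing. The set of configured server IDs (1 and 2 here) is an assumption inferred from the values posted earlier in the thread, and `check_context_csns` is a hypothetical helper name.

```python
from collections import defaultdict


def group_by_sid(values):
    """Group CSN strings by the third '#'-separated field (server ID, hex)."""
    groups = defaultdict(list)
    for value in values:
        _stamp, _count, sid, _mod = value.split("#")
        groups[int(sid, 16)].append(value)
    return dict(groups)


def check_context_csns(values, configured_sids):
    """Compare the server IDs found in contextCSN with the configured ones."""
    groups = group_by_sid(values)
    return {
        "duplicates": sorted(s for s, vals in groups.items() if len(vals) > 1),
        "unexpected": sorted(set(groups) - set(configured_sids)),
        "missing": sorted(set(configured_sids) - set(groups)),
    }


# Values from the first set posted in this thread.
VALUES = [
    "20120420144311.217351Z#000000#000#000000",
    "20120903132738.849382Z#000000#001#000000",
    "20120903132426.927826Z#000000#002#000000",
]

if __name__ == "__main__":
    # Assumption: the two providers use serverID 1 and 2.
    print(check_context_csns(VALUES, configured_sids={1, 2}))
```

Under that assumption, the only finding is the value carrying server ID 000, which matches the explanation that it is a leftover from the initial dataset rather than a sign of a third provider.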
--On Tuesday, September 04, 2012 12:12 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
> I did, of course.
>
> However, the number of issues alone isn't enough to distinguish corner cases from issues with immediate impact. Given that I've been running LDAP server 2.4.19 in multi-master mode for years without actual problems, I'm used to taking the usual 'you should never run anything other than the latest version, otherwise everything will blow up' advice with a grain of salt. Also, I'm in a very low-concurrency setup, and I could easily make sure that 99% of write operations always go to a single server.
This isn't the general "it is wisest to run the latest patch level" advice. If you read the changes, there are numerous, *significant* fixes to syncrepl in general (all of which affect syncrepl MMR) and ones specific to syncrepl MMR as well. You noted you had divergence in your database. I personally would want to do what I could to avoid such a state.
--Quanah
openldap-technical@openldap.org