I have hit a replication problem that I am scratching my head over. We are in the midst of migrating OpenLDAP from 2.3 to 2.4. In our test environment we have a 2.3.43 master with 2.4.18 and 2.3.35 slaves. Replication to the 2.4 slave stopped this morning. When I look at the log I see:
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: cookie=csn=20091022031949Z#000000#00#000000,rid=000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: rid=000 CSN too old, ignoring 20091022031949.000000Z#000000#000#000000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: cookie=csn=20091022031949Z#000001#00#000000,rid=000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: rid=000 CSN too old, ignoring 20091022031949.000000Z#000001#000#000000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: cookie=csn=20091022031949Z#000002#00#000000,rid=000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: rid=000 CSN too old, ignoring 20091022031949.000000Z#000002#000#000000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: cookie=csn=20091022031949Z#000003#00#000000,rid=000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: rid=000 CSN too old, ignoring 20091022031949.000000Z#000003#000#000000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: cookie=csn=20091022031949Z#000004#00#000000,rid=000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: rid=000 CSN too old, ignoring 20091022031949.000000Z#000004#000#000000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: do_syncrep2: cookie=csn=20091022031949Z#000005#00#000000,rid=000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: slap_queue_csn: queing 0x7f09c283b075 20091022031949Z#000005#00#000000
Oct 22 09:38:28 ldap-uat1 slapd[15032]: syncrepl_message_to_op: rid=000 mods check (homePostalAddress: value #0 invalid per syntax)
Oct 22 09:38:28 ldap-uat1 slapd[15032]: slap_graduate_commit_csn: removing 0x7f09c1621438 20091022031949Z#000005#00#000000
So, I slapcat'ed the master's database and looked for that contextCSN. The entry that appears to be the problem has a homePostalAddress of:
homePostalAddress: c\o Bain & Company, Two Copley Place, Boston, Massachusetts
We had hit this problem before when loading a 2.4 server using slapcat output from a 2.3 server and wrote a simple filter to correctly quote the backslash.
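For illustration, a filter of the kind described might look like the following Python sketch. It is not the original script: the attribute list and the names POSTAL_ATTRS, fix_value and fix_line are assumptions made for this example. It relies on the Postal Address syntax rule in RFC 4517 that a literal backslash must be written as \5C (and a literal dollar sign as \24). It does not handle folded LDIF lines, base64-encoded values, or attribute options.

```python
import re

# Assumption: attributes that use Postal Address syntax in this directory.
# Adjust the set to match your own schema.
POSTAL_ATTRS = {"postaladdress", "homepostaladdress", "registeredaddress"}

# A backslash that does not already begin a \5C or \24 escape.
BARE_BACKSLASH = re.compile(r"\\(?!5[Cc]|24)")


def fix_value(value):
    """Replace each bare backslash in a postal address value with \\5C."""
    return BARE_BACKSLASH.sub(lambda m: "\\5C", value)


def fix_line(line):
    """Escape bare backslashes on one unfolded 'attr: value' LDIF line.

    Lines for other attributes, and base64 lines ('attr:: ...'), are
    returned unchanged.
    """
    attr, sep, value = line.partition(": ")
    if not sep or attr.lower() not in POSTAL_ATTRS:
        return line
    return attr + sep + fix_value(value)


# Typical use as a filter over slapcat output (hypothetical invocation):
#   import sys
#   for line in sys.stdin:
#       sys.stdout.write(fix_line(line))
```

Applied to the entry above, the value becomes "c\5Co Bain & Company, Two Copley Place, Boston, Massachusetts", which keeps the original characters but is valid for a 2.4 server. Values that are already escaped are left alone, so the filter can be run more than once.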
This is a problem for us because we expect to have a mixed 2.3/2.4 environment during the transition to 2.4 on our production servers. The really nasty bit is that the only way I know for sure to fix this problem is to reload the slave. Is there an alternative way?
Bill
--On Thursday, October 22, 2009 5:19 PM +0000 Bill MacAllister whm@stanford.edu wrote:
This is a problem for us because we expect to have a mixed 2.3/2.4 environment during the transition to 2.4 on our production servers. The really nasty bit is that the only way I know for sure to fix this problem is to reload the slave. Is there an alternative way?
Sanitize the data before it is written to the master.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
--On Thursday, October 22, 2009 5:19 PM +0000 Bill MacAllister whm@stanford.edu wrote:
This is a problem for us because we expect to have a mixed 2.3/2.4 environment during the transition to 2.4 on our production servers. The really nasty bit is that the only way I know for sure to fix this problem is to reload the slave. Is there an alternative way?
Sanitize the data before it is written to the master.
And think about what degree of "sanitization" you're going for. In your particular case, "c/o" is the common practice usage, not "c\o". I guess "c\o" is the product of a generation of people raised on Windows, who think "\" is "slash" and don't realize that it is actually "back-slash", and they've been using it incorrectly all their lives...
openldap-technical@openldap.org