--On Monday, May 11, 2015 7:32 PM +0000 quanah@zimbra.com wrote:
--On Friday, May 08, 2015 7:56 PM +0000 quanah@zimbra.com wrote:
--On May 4, 2015 at 8:00:45 PM +0000 openldap-its@OpenLDAP.org wrote:
In addition to the missing entries, multiple differences exist for existing entries as well. At this point, it seems that replication with 2.4.41 is entirely unsafe.
The existing entry drift seems to be related to the locked up replication that occured because of a bug in openssl. So the real issue is just the entries that weren't created, which is a current known limitation of syncrepl refresh that needs fixing.
Unfortunately, I'm now seeing this at a second customer site with 2-node MMR, where it is running current RE24 (2.4.41).
In this case:
ldap01 and ldap02 were stopped openldap was upgraded from 2.4.39 to 2.4.41 ldap01's db was slapcat'd ldap02's db was erased ldap02's db was recreated from ldap01's slapcat
Now they are having drift, and constant refresh troubles:
[zimbra@ldap01-dc01 ~]$ grep REFRESH /var/log/zimbra.log May 15 08:56:02 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515065528.000299Z,cn=accesslog), switching to REFRESH May 15 08:58:07 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 LDAP_RES_INTERMEDIATE - REFRESH_DELETE May 15 09:57:18 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515075718.000081Z,cn=accesslog), switching to REFRESH May 15 10:46:57 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515084653.000032Z,cn=accesslog), switching to REFRESH May 15 12:43:18 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515104318.000049Z,cn=accesslog), switching to REFRESH May 15 16:32:19 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515143211.000250Z,cn=accesslog), switching to REFRESH May 15 16:36:41 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515143556.000046Z,cn=accesslog), switching to REFRESH May 15 17:23:10 ldap01-dc01 slapd[31479]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515152310.000002Z,cn=accesslog), switching to REFRESH
[zimbra@ldap02-dc01 xx]$ grep FRESH /var/log/zimbra.log May 15 08:35:46 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515063539.000009Z,cn=accesslog), switching to REFRESH May 15 08:47:44 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515064733.000043Z,cn=accesslog), switching to REFRESH May 15 08:56:03 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515065528.000234Z,cn=accesslog), switching to REFRESH May 15 08:57:24 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 LDAP_RES_INTERMEDIATE - REFRESH_DELETE May 15 09:31:12 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515073112.000086Z,cn=accesslog), switching to REFRESH May 15 10:46:58 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515084653.000003Z,cn=accesslog), switching to REFRESH May 15 13:44:11 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515114405.000313Z,cn=accesslog), switching to REFRESH May 15 14:54:49 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515125448.000002Z,cn=accesslog), switching to REFRESH May 15 17:23:11 ldap02-dc01 slapd[18167]: do_syncrep2: rid=201 delta-sync lost sync on (reqStart=20150515152311.000035Z,cn=accesslog), switching to REFRESH
This is caused by modify ops:
[zimbra@ldap02-dc01 xx]$ grep (20) /var/log/zimbra.log May 15 08:35:46 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=xxx,dc=com (20) May 15 08:47:44 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=yyy,dc=co,dc=za (20) May 15 08:56:03 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=zzz,dc=co,dc=za (20) May 15 09:31:12 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=www,dc=co,dc=za (20) May 15 10:46:58 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=ppp,dc=co,dc=za (20) May 15 13:44:11 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=rrr,dc=co,dc=za (20) May 15 14:54:49 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=zzz,dc=com (20) May 15 17:23:11 ldap02-dc01 slapd[18167]: syncrepl_message_to_op: rid=201 be_modify uid=xxx,ou=people,dc=zzz,dc=com (20)
So overall, there seems to be significant problems with RE24 in relation to replication at this time.
Clocks are tightly in sync.
--Quanah
--
Quanah Gibson-Mount Platform Architect Zimbra, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration