--On Saturday, September 03, 2016 4:51 PM +0000 quanah(a)zimbra.com wrote:
--On Saturday, September 03, 2016 6:15 AM +0000 quanah(a)openldap.org wrote:
> Full_Name: Quanah Gibson-Mount
> Version: 2.4.44+ITS8432
> OS: Linux 2.6
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (75.111.52.177)
>
>
> Trying to reproduce another ITS, I discovered a new bug. When doing
> MODRDN ops on one master, the other master keeps going out of sync.
> Specifically:
>
> Sep 3 01:12:17 zre-ldap002 slapd[29206]: syncrepl_message_to_op: rid=100
> be_modrdn uid=user.924,ou=people,dc=zre-ldap002,dc=eng,dc=zimbra,dc=com
> (32)
> Sep 3 01:12:17 zre-ldap002 slapd[29206]: do_syncrep2: rid=100
> delta-sync lost sync on (reqStart=20160903051215.747829Z,cn=accesslog),
> switching to REFRESH
Note that this master also has a replica. The replica never rejected a
single one of these MODRDNs coming from this master, which means that
either:
a) The data on the master spontaneously corrupted at some point
or
b) The master wrote the MODRDNs to the accesslog, which the replica
picked up, but did not itself apply the MODRDN changes to its own database.
In the end, of the 50,000 MODRDNs it was processing, it threw an error 32
(noSuchObject) for 441 of them.
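
For anyone wanting to reproduce the count, here is a minimal sketch of how
the failures could be tallied from syslog. It assumes the log lines have the
same shape as the syncrepl_message_to_op line quoted earlier in this report;
the regex and the function name count_modrdn_results are illustrative only
and are not part of OpenLDAP.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this report:
#   ... syncrepl_message_to_op: rid=100 be_modrdn <dn> (<result code>)
MODRDN_RESULT = re.compile(
    r"syncrepl_message_to_op: rid=(?P<rid>\d+) "
    r"be_modrdn (?P<dn>.+?) \((?P<code>\d+)\)"
)


def count_modrdn_results(lines):
    """Return a Counter mapping LDAP result code -> number of be_modrdn lines."""
    counts = Counter()
    for line in lines:
        match = MODRDN_RESULT.search(line)
        if match:
            counts[int(match.group("code"))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input; in practice, iterate over the syslog file.
    sample = [
        "Sep 3 01:12:17 zre-ldap002 slapd[29206]: syncrepl_message_to_op: "
        "rid=100 be_modrdn uid=user.924,ou=people,dc=zre-ldap002,dc=eng,"
        "dc=zimbra,dc=com (32)",
        "Sep 3 01:12:17 zre-ldap002 slapd[29206]: do_syncrep2: rid=100 "
        "delta-sync lost sync on (reqStart=20160903051215.747829Z,"
        "cn=accesslog), switching to REFRESH",
    ]
    print(count_modrdn_results(sample)[32])
```

The count under key 32 is the number of MODRDNs that the consumer rejected
with noSuchObject; lines that are not be_modrdn results are ignored.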
After the master that was not accepting direct writes resynced with the
master accepting writes, it still had 403 of 50,000 entries wrong. So did
its replica. So the master isn't writing the changes to the accesslog. So
it's option c: the master rejects a valid op, never syncs correctly, and
in the end two-thirds of my servers have invalid databases.
I see zero indication that using a sessionlog works around
<http://www.openldap.org/its/index.cgi/?findid=8125> at all. I still end
up with missed entries even with everything *in* the sessionlog.
--Quanah
--
Quanah Gibson-Mount