--On Saturday, September 03, 2016 4:51 PM +0000 quanah@zimbra.com wrote:
--On Saturday, September 03, 2016 6:15 AM +0000 quanah@openldap.org wrote:
Full_Name: Quanah Gibson-Mount
Version: 2.4.44+ITS8432
OS: Linux 2.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (75.111.52.177)
Trying to reproduce another ITS, I discovered a new bug. When doing MODRDN ops on one master, the other master keeps going out of sync. Specifically:
Sep 3 01:12:17 zre-ldap002 slapd[29206]: syncrepl_message_to_op: rid=100 be_modrdn uid=user.924,ou=people,dc=zre-ldap002,dc=eng,dc=zimbra,dc=com (32)
Sep 3 01:12:17 zre-ldap002 slapd[29206]: do_syncrep2: rid=100 delta-sync lost sync on (reqStart=20160903051215.747829Z,cn=accesslog), switching to REFRESH
Note that this master also has a replica, and the replica never rejected a single one of these MODRDNs coming from this master. That means that either:
a) The data on the master spontaneously corrupted at some point
or
b) The master wrote the MODRDNs to the accesslog, which the replica picked up, but did not itself make the MODRDN changes to its database.
In the end, of the 50,000 MODRDNs it was processing, it threw an error 32 for 441 of them.
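For anyone wanting to reproduce the count, here is a minimal sketch of how the failed MODRDNs could be tallied from syslog. It assumes the log lines look exactly like the be_modrdn line quoted above, with the LDAP result code in trailing parentheses; the function name and the regular expression are illustrative only and are not part of slapd or any OpenLDAP tool.

```python
import re

# Assumed line shape, taken from the log excerpt in this report:
#   ... syncrepl_message_to_op: rid=<N> be_modrdn <DN> (<result code>)
# Result code 32 is LDAP noSuchObject.
MODRDN_RE = re.compile(
    r"syncrepl_message_to_op: rid=(?P<rid>\d+) be_modrdn (?P<dn>.+) \((?P<code>\d+)\)\s*$"
)


def failed_modrdns(lines, code=32):
    """Return the DNs whose be_modrdn log line ended with the given result code."""
    failed = []
    for line in lines:
        m = MODRDN_RE.search(line)
        if m and int(m.group("code")) == code:
            failed.append(m.group("dn"))
    return failed


# The two log lines quoted in this report, used as sample input.
sample = [
    "Sep 3 01:12:17 zre-ldap002 slapd[29206]: syncrepl_message_to_op: rid=100 "
    "be_modrdn uid=user.924,ou=people,dc=zre-ldap002,dc=eng,dc=zimbra,dc=com (32)",
    "Sep 3 01:12:17 zre-ldap002 slapd[29206]: do_syncrep2: rid=100 delta-sync lost sync on "
    "(reqStart=20160903051215.747829Z,cn=accesslog), switching to REFRESH",
]

print(len(failed_modrdns(sample)), "failed be_modrdn op(s)")
```

Feeding it the full syslog (for example, the lines of an open log file) instead of the sample list would give the per-server total, provided the log level in use emits these lines.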
After the master that was not accepting direct writes re-synced with the master accepting writes, it still had 403/50000 entries wrong. So did its replica. So the master isn't writing the changes to the accesslog. So it's option c: the master rejects a valid op, never syncs correctly, and in the end two thirds of my servers have invalid databases.
I see zero indication that using a sessionlog works around http://www.openldap.org/its/index.cgi/?findid=8125 at all. I still end up with missed entries even with everything *in* the sessionlog.
--Quanah
--
Quanah Gibson-Mount