--On Thursday, September 01, 2016 9:50 AM +0000 quanah@zimbra.com wrote:
I finally was able to reproduce this in a new lab of servers I set up. The easiest way to reproduce this problem is to have 2 masters, and 6 replicas. 3 replicas pointing to each master.
In my case, I have ldap07 through ldap014. 07/08 are the masters. Odd number replicas (9,11,13) point to 07, even number replicas point to 08 (10, 12, 14).
What we find based on the logs is two MODRDNs coming in at approximately the same time:
Sep 9 15:29:40 zre-ldap007 slapd[13794]: conn=1389 op=3179 MODRDN dn="uid=user.345,ou=people,dc=zre-ldap007,dc=eng,dc=zimbra,dc=com" Sep 9 15:29:40 zre-ldap007 slapd[13794]: conn=1120 op=2919 MODRDN dn="uid=user.250404,ou=people,dc=zre-ldap007,dc=eng,dc=zimbra,dc=com" Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_queue_csn: queueing 0xa046e80 20160909192940.809794Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_queue_csn: queueing 0xa046a40 20160909192940.809794Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_queue_csn: queueing 0xa07b740 20160909192940.810205Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_graduate_commit_csn: removing 0xa046a40 20160909192940.809794Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_queue_csn: queueing 0xa07bdc0 20160909192940.810205Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: conn=1389 op=3179 RESULT tag=109 err=0 etime=0.001278 text= Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_graduate_commit_csn: removing 0xa046e80 20160909192940.809794Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_graduate_commit_csn: removing 0xa07bdc0 20160909192940.810205Z#000000#001#000000 Sep 9 15:29:40 zre-ldap007 slapd[13794]: conn=1120 op=2919 RESULT tag=109 err=0 etime=0.001597 text= Sep 9 15:29:40 zre-ldap007 slapd[13794]: slap_graduate_commit_csn: removing 0xa07b740 20160909192940.810205Z#000000#001#000000
But in the accesslog, we only find the one MODRDN logged:
dn: reqStart=20160909192940.809465Z,cn=accesslog objectClass: auditModRDN structuralObjectClass: auditModRDN reqStart: 20160909192940.809465Z reqEnd: 20160909192940.810148Z reqType: modrdn reqSession: 1389 reqAuthzID: uid=zimbra,cn=admins,cn=zimbra reqDN: uid=user.345,ou=people,dc=zre-ldap007,dc=eng,dc=zimbra,dc=com reqResult: 0 reqMod: entryCSN:= 20160909192940.809794Z#000000#001#000000 reqMod: modifiersName:= uid=zimbra,cn=admins,cn=zimbra reqMod: modifyTimestamp:= 20160909192940Z reqNewRDN: uid=renamed-user.345 reqDeleteOldRDN: TRUE reqEntryUUID: c63b246a-0a20-1036-82cb-a772509d7f53 entryUUID: 7f740e6c-0b0f-1036-8c00-b1b4f08a59fc creatorsName: cn=config createTimestamp: 20160909192940Z entryCSN: 20160909192940.809794Z#000000#001#000000 modifiersName: cn=config modifyTimestamp: 20160909192940Z
Possibly a DN collision in writing the entry to the accesslog DB that causes the second OP to get lost. Likely the same root issue as ITS#8493.
--Quanah
--
Quanah Gibson-Mount