The problem: when I make conflicting writes while replication is broken and then restore replication, the nodes end up disagreeing on the value of employeeNumber for the entry, while every other attribute is the same.
My cluster has 3 nodes.
When all is said and done, the entry looks like the following on two nodes:
dn: cn=fakeentryfortesting,ou=people,dc=temple,dc=edu
objectClass: inetOrgPerson
cn: fakeentryfortesting
sn: fakeentryfortesting
givenName: fakeentryfortesting
carLicense: 123-XYZ
description: take two from 16
description: take two from 17
description: written on 16
description: written on 17
employeeNumber: from16
employeeType: deadwood
entryCSN: 20200124200603.270449Z#000000#075#000000
entryDN: cn=fakeentryfortesting,ou=people,dc=temple,dc=edu
entryUUID: 737fddce-d330-1039-89fc-4599653df8c5
hasSubordinates: FALSE
modifiersName: cn=admin,dc=temple,dc=edu
modifyTimestamp: 20200124200603Z
createTimestamp: 20200124200419Z
creatorsName: cn=admin,dc=temple,dc=edu
structuralObjectClass: inetOrgPerson
subschemaSubentry: cn=Subschema
BUT, on one node, it has
employeeNumber: from17
as the only difference.
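For anyone who wants to repeat the comparison, here is a minimal Python sketch of how two dumps of the entry can be diffed attribute by attribute. It is not my actual test script; the function names are made up for illustration. It assumes unwrapped LDIF with no base64 values (for example, ldapsearch output with -o ldif-wrap=no).

```python
from collections import defaultdict


def parse_entry(ldif_text):
    """Parse one unwrapped LDIF entry into {attribute: set(values)}.

    Assumption: no folded lines and no base64 ("attr:: value") values.
    Attribute names are lowercased because LDAP treats them case-insensitively.
    """
    entry = defaultdict(set)
    for line in ldif_text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue
        attr, sep, value = line.partition(":")
        if not sep:
            continue
        entry[attr.strip().lower()].add(value.strip())
    return dict(entry)


def diff_entries(a, b):
    """Return {attribute: (values only in a, values only in b)} for mismatches."""
    result = {}
    for attr in sorted(set(a) | set(b)):
        only_a = a.get(attr, set()) - b.get(attr, set())
        only_b = b.get(attr, set()) - a.get(attr, set())
        if only_a or only_b:
            result[attr] = (only_a, only_b)
    return result
```

Values are compared as sets, so the order in which the description values come back from each node does not matter.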
Judging from an earlier post, this is not expected behavior. I'm including some info here in the hope that it's enough to find a workaround. If not, I can provide whatever debugging info is needed.
This is a source compiled slapd 2.4.48 running on RHEL 7.7 with a preloaded libtcmalloc.so.4. I experienced the same behavior with 2.4.45.
Some gory details...
--------------
OVERVIEW
--------------
I initially set up a single node as standalone. I then loaded a small LDIF to give it a tree to replicate, and after that converted the node to delta multimaster. Details of each step are further down.
I wrote an entry and verified that it got the correct Server ID in the contextCSN.
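To read the Server ID out of a CSN, a small sketch like the following can help. It assumes the usual OpenLDAP CSN layout of timestamp#count#SID#mod, with the last three fields in hexadecimal; the names parse_csn and CSN are made up for illustration. For the entryCSN shown earlier, the SID field is 075, which is 117 in decimal.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    time: datetime  # change timestamp, UTC
    count: int      # change counter within the same timestamp
    sid: int        # Server ID of the node that originated the change
    mod: int        # modification number within the operation


def parse_csn(csn):
    """Split a CSN such as 20200124200603.270449Z#000000#075#000000."""
    stamp, count, sid, mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    when = when.replace(tzinfo=timezone.utc)
    # Assumption: count, SID and mod are hexadecimal fields.
    return CSN(when, int(count, 16), int(sid, 16), int(mod, 16))
```

The decoded SID can then be compared against the numbers in olcServerID.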
I did
ldapsearch -H ldapi:// -Y external -LLL -b cn=config >config.ldif
to save the resulting configuration. Then I did a slapadd of that to another fresh node with empty slapd.d and database folders. It synced the data as expected. I repeated on another node.
So, I have 3 nodes with identical configurations, call them 16, 17 and 18. Replication seems to be working as expected.
slapd is run with:
-h ldapi:/// ldaps://pre-ldapr16.temple.edu
(or pre-ldapr17 or 18, depending on the node). I verified many, many times that these URLs are identical to the ones in olcServerID and to the provider values in the olcSyncrepl attribute.
I'm the only one accessing these nodes right now, and certainly the only one using this test DN.
-------------
TEST SCENARIO
-------------
My test for this mostly follows test063-delta-multimaster.
Below, $TESTDN is cn=fakeentryfortesting,ou=people,dc=temple,dc=edu.
It seeds a test entry and makes sure it is the same on each node:
dn: $TESTDN
objectClass: inetorgperson
cn: fakeentryfortesting
sn: fakeentryfortesting
givenName: $TESTCN
description: test
It breaks replication on each node by replacing the password in the olcSyncrepl values with a wrong password.
Below, $X is 16, 17 or 18, depending on the node.
For each of the three nodes, it applies:
dn: $TESTDN
changetype: modify
add: description
description: written on $X
Then for each node it applies:
dn: $TESTDN
changetype: modify
delete: description
description: test
-
add: description
description: take two from $X
Then on 16, it applies:
dn: $TESTDN
changetype: modify
add: carLicense
carLicense: 123-XYZ
-
add: employeeNumber
employeeNumber: from$X
and on a different node it applies:
dn: $TESTDN
changetype: modify
add: employeeType
employeeType: deadwood
-
add: employeeNumber
employeeNumber: from$X
It then restores replication and checks the entry on the different nodes.
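The modifications above can be generated per node with something like the sketch below. This is an illustration only, not my actual script: the helper names modify and steps_for are made up, and applying the output (for example with ldapmodify) is left out. Note that the test applies the first step on every node before moving on to the second step, so the caller has to interleave the per-node lists accordingly.

```python
TESTDN = "cn=fakeentryfortesting,ou=people,dc=temple,dc=edu"


def modify(dn, *changes):
    """Build one LDIF modify record; each change is (op, attr, value)."""
    lines = [f"dn: {dn}", "changetype: modify"]
    for i, (op, attr, value) in enumerate(changes):
        if i:
            lines.append("-")  # separator between changes in one record
        lines += [f"{op}: {attr}", f"{attr}: {value}"]
    return "\n".join(lines) + "\n"


def steps_for(node, extra_attr=None, extra_value=None):
    """Return the LDIF records for one node, in the order described above.

    extra_attr/extra_value select the final conflicting write, e.g.
    ("carLicense", "123-XYZ") on 16 or ("employeeType", "deadwood") elsewhere.
    """
    steps = [
        modify(TESTDN, ("add", "description", f"written on {node}")),
        modify(TESTDN,
               ("delete", "description", "test"),
               ("add", "description", f"take two from {node}")),
    ]
    if extra_attr is not None:
        steps.append(modify(TESTDN,
                            ("add", extra_attr, extra_value),
                            ("add", "employeeNumber", f"from{node}")))
    return steps
```

For example, steps_for("16", "carLicense", "123-XYZ") yields the three records applied on 16, and steps_for("18") yields only the two description changes.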
------------------
INITIAL STANDALONE CONFIG
------------------
dn: cn=config
objectClass: olcGlobal
cn: config
olcArgsFile: /var/run/openldap/slapd.args
olcAuthzPolicy: to
olcAuthzRegexp: {0}uid=ldap/([^,]*),cn=PREKDC.TEMPLE.EDU,cn=gssapi,cn=auth $1
olcConnMaxPending: 101
olcLogLevel: stats stats2
olcPidFile: /var/run/openldap/slapd.pid
olcReadOnly: FALSE
olcSecurity: simple_bind=128
olcSecurity: tls=0
olcSecurity: ssf=71
olcSockbufMaxIncoming: 22143
olcSockbufMaxIncomingAuth: 16777215
olcThreads: 8
olcTLSCACertificateFile: /etc/ssl/certs/intermediate.crt
olcTLSCACertificatePath: /etc/openldap/certs
olcTLSCertificateFile: /etc/ssl/certs/ldap.pem
olcTLSCertificateKeyFile: /etc/ssl/private/ldap.key.pem
olcTLSCipherSuite: ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+
AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS:!RC4:!3DES:!eNULL:!PSK
olcTLSDHParamFile: /etc/ssl/private/dhparam.bin
olcTLSProtocolMin: 3.2
olcToolThreads: 2
dn: cn=module{0},cn=config
objectClass: olcModuleList
cn: module{0}
olcModulePath: /usr/local/libexec/openldap
olcModuleLoad: {0}back_mdb
olcModuleLoad: {1}auditlog
olcModuleLoad: {2}unique
dn: cn=schema,cn=config
# ... entry and subtree omitted for brevity
dn: olcDatabase={-1}frontend,cn=config
objectClass: olcDatabaseConfig
objectClass: olcFrontendConfig
olcDatabase: {-1}frontend
olcAccess: {0} to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=externa
l,cn=auth manage by dn.base="cn=replicator,dc=temple,dc=edu" manage by * brea
k
olcAccess: {1}to dn.base="cn=Subschema" by * read
olcAccess: {2}to dn.exact="" by * read
olcLastMod: TRUE
olcReadOnly: FALSE
olcSizeLimit: size.soft=2000 size.hard=1000000
olcTimeLimit: time.soft=300 time.hard=3600
olcPasswordHash: {SSHA}
olcSortVals: hasMember
dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcAccess: {0}to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external
,cn=auth manage by dn.base="cn=replicator,dc=temple,dc=edu" read by dn.exact=
"cn=admin,dc=temple,dc=edu" manage by * break
olcAccess: {1}to dn.base="cn=Subschema" by * read
olcAccess: {2}to dn.exact="" by * read
olcAccess: {3}to * by * none
olcAddContentAcl: TRUE
olcLastMod: TRUE
olcRootDN: cn=admin,cn=config
olcRootPW: {SSHA}..........
olcMonitoring: FALSE
dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /usr/local/var/lib/ldap/rootdb
olcSuffix: dc=temple,dc=edu
olcAccess: {0} to dn.exact="cn=Subschema" by * read
olcAccess: {1} to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=externa
l,cn=auth manage by dn.base="cn=replicator,dc=temple,dc=edu" manage by * brea
k
olcAccess: {2} to attrs=userPassword by * auth
olcAccess: {3} to dn.exact="" by * read
olcLastMod: TRUE
olcLimits: {0}dn.base="cn=replicator,dc=temple,dc=edu" size=unlimited time=unl
imited
olcRootDN: cn=admin,dc=temple,dc=edu
olcRootPW: {SSHA}.................
olcMonitoring: TRUE
olcDbEnvFlags: nosync
olcDbIndex: default eq
olcDbIndex: uid pres,eq,sub
olcDbIndex: uidNumber,gidNumber eq
olcDbIndex: member,memberUid eq
olcDbIndex: templeEduTUID eq,sub
olcDbIndex: templeEduTUNIC eq
# other indexes omitted
olcDbIndex: cn,sn pres,eq,approx,sub
olcDbIndex: mail pres,eq,sub
olcDbIndex: objectClass pres,eq
olcDbIndex: loginShell pres,eq
olcDbIndex: entryCSN,entryUUID
olcDbIndex: givenName,o,name,ou,displayName,eduPersonNickname eq,sub
olcDbIndex: hasMember eq
olcDbIndex: mailAlternateAddress eq,sub
olcDbIndex: manager eq
olcDbIndex: owner eq
olcDbIndex: uniqueMember eq
olcDbMaxSize: 17179869184
dn: olcOverlay={0}auditlog,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcAuditLogConfig
olcOverlay: {0}auditlog
olcAuditlogFile: /var/log/ldap/audit.ldif
dn: olcOverlay={1}unique,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcUniqueConfig
olcOverlay: {1}unique
olcUniqueAttribute: templeEduTUNA
olcUniqueAttribute: templeEduTUid
--------------
INITIAL DATA
--------------
dn: dc=temple,dc=edu
objectClass: dcObject
objectClass: organization
o: temple.edu
dc: temple
dn: cn=replicator,dc=temple,dc=edu
objectClass: simpleSecurityObject
objectClass: organizationalRole
cn: replicator
description: LDAP replicator
userPassword: {SSHA}....
dn: ou=people,dc=temple,dc=edu
objectClass: top
objectClass: organizationalUnit
ou: people
------------------
MODIFICATION TO MAKE IT DELTA MMR
------------------
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: syncprov
olcModuleLoad: accesslog
dn: cn=config
changetype: modify
replace: olcServerID
olcServerID: 116 ldaps://pre-ldapr16.temple.edu
olcServerID: 117 ldaps://pre-ldapr17.temple.edu
olcServerID: 118 ldaps://pre-ldapr18.temple.edu
dn: olcDatabase={2}mdb,cn=config
changetype: add
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {2}mdb
olcDbDirectory: /usr/local/var/lib/ldap/accesslog
olcSuffix: cn=accesslog
olcAccess: {0}to dn.subtree="cn=accesslog"
by dn.exact="cn=replicator,dc=temple,dc=edu" read
by dn.exact="gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth" read
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcReadOnly: FALSE
olcRootDN: cn=config
olcLimits: dn.exact="cn=replicator,dc=temple,dc=edu"
time=unlimited size=unlimited
olcSizeLimit: unlimited
olcTimeLimit: unlimited
olcMonitoring: TRUE
olcDbCheckpoint: 0 0
olcDbIndex: entryCSN,objectClass,reqEnd,reqResult,reqStart,reqDN,entryUUID eq
olcDbMode: 0600
olcDbSearchStack: 16
olcDbMaxsize: 85899345920
dn: olcOverlay={0}syncprov,olcDatabase={2}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpNoPresent: TRUE
olcSpReloadHint: TRUE
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpCheckpoint: 20 10
olcSpSessionlog: 10000000
olcSpReloadHint: TRUE
dn: olcOverlay={1}accesslog,olcDatabase={1}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcAccessLogConfig
olcOverlay: {1}accesslog
olcAccessLogDB: cn=accesslog
olcAccessLogOps: writes
olcAccessLogSuccess: TRUE
olcAccessLogPurge: 01+00:00 00+04:00
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSyncrepl
olcSyncrepl: {0}rid=306 provider=ldaps://pre-ldapr16.temple.edu
bindmethod=simple
binddn="cn=replicator,dc=temple,dc=edu" credentials=password
keepalive=0:5:0 tls_reqcert=allow searchbase="dc=temple,dc=edu"
scope=sub schemachecking=on type=refreshAndPersist retry="5 5 5 +"
logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
logbase=cn=accesslog syncdata=accesslog
olcSyncrepl: {1}rid=307 provider=ldaps://pre-ldapr17.temple.edu
bindmethod=simple
binddn="cn=replicator,dc=temple,dc=edu" credentials=password
keepalive=0:5:0 tls_reqcert=allow searchbase="dc=temple,dc=edu"
scope=sub schemachecking=on type=refreshAndPersist retry="5 5 5 +"
logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
logbase=cn=accesslog syncdata=accesslog
olcSyncrepl: {2}rid=308 provider=ldaps://pre-ldapr18.temple.edu
bindmethod=simple
binddn="cn=replicator,dc=temple,dc=edu" credentials=password
keepalive=0:5:0 tls_reqcert=allow searchbase="dc=temple,dc=edu"
scope=sub schemachecking=on type=refreshAndPersist retry="5 5 5 +"
logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
logbase=cn=accesslog syncdata=accesslog
-
replace: olcMirrorMode
olcMirrorMode: TRUE
You really read this far down? Awesome! Maybe something jumped out at you as wrong with this?
Thanks for any help,
Zach