I am seeing inconsistent values for an attribute after conflicting writes. I mostly followed https://mishikal.wordpress.com/2019/04/23/configuring-mmr-using-delta-syncre... for the setup, and mostly followed test063-delta-multimaster from the source code for the test scenario.
The problem is that when I do conflicting writes while replication is broken and replication is then restored, the entry ends up with different values for employeeNumber across nodes, while everything else is the same.
My cluster has 3 nodes. When all is said and done, the entry looks like the following on two nodes:

dn: cn=fakeentryfortesting,ou=people,dc=temple,dc=edu
objectClass: inetOrgPerson
cn: fakeentryfortesting
sn: fakeentryfortesting
givenName: fakeentryfortesting
carLicense: 123-XYZ
description: take two from 16
description: take two from 17
description: written on 16
description: written on 17
employeeNumber: from16
employeeType: deadwood
entryCSN: 20200124200603.270449Z#000000#075#000000
entryDN: cn=fakeentryfortesting,ou=people,dc=temple,dc=edu
entryUUID: 737fddce-d330-1039-89fc-4599653df8c5
hasSubordinates: FALSE
modifiersName: cn=admin,dc=temple,dc=edu
modifyTimestamp: 20200124200603Z
createTimestamp: 20200124200419Z
creatorsName: cn=admin,dc=temple,dc=edu
structuralObjectClass: inetOrgPerson
subschemaSubentry: cn=Subschema

BUT, on one node, it has employeeNumber: from17 as the only difference.
From an earlier post, this is not expected behavior. I'm including some info here in hopes that it's enough to get a workaround. If not, I can provide whatever debugging info is needed.
This is a source compiled slapd 2.4.48 running on RHEL 7.7 with a preloaded libtcmalloc.so.4. I experienced the same behavior with 2.4.45.
Some gory details...
-------------- Overview --------------
I initially set up a single node as standalone. I then added a small ldif to give a tree to replicate. Then, I made the node do delta multimaster. Details of this are further down.
I wrote an entry and verified that it got the correct Server ID in the contextCSN.
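For anyone cross-checking which server a CSN came from, here is a minimal Python sketch. It assumes the OpenLDAP 2.4 CSN layout of timestamp#count#sid#mod, with the last three fields in hexadecimal. The names parse_csn and CSN are hypothetical helpers for illustration, not part of any OpenLDAP tooling, and the sample value is the entryCSN shown earlier in this message.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    """Hypothetical container for the four '#'-separated CSN fields."""
    timestamp: datetime
    count: int
    sid: int
    mod: int


def parse_csn(csn: str) -> CSN:
    # Assumed layout: YYYYmmddHHMMSS.uuuuuuZ#count#sid#mod (count/sid/mod in hex)
    ts, count, sid, mod = csn.split("#")
    when = datetime.strptime(ts, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    return CSN(when, int(count, 16), int(sid, 16), int(mod, 16))


csn = parse_csn("20200124200603.270449Z#000000#075#000000")
print(csn.sid)  # decimal server ID taken from the hex sid field
```

Under that assumption, the sid field 075 is hex for 117, which matches the olcServerID configured for pre-ldapr17.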
I did ldapsearch -H ldapi:// -Y external -LLL -b cn=config >config.ldif to save the resulting configuration. Then I did a slapadd of that to another fresh node with empty slapd.d and database folders. It synced the data as expected. I repeated on another node.
So, I have 3 nodes with identical configurations, call them 16, 17 and 18. Replication seems to be working as expected.
slapd is run with: -h ldapi:/// ldaps://pre-ldapr16.temple.edu (or pre-ldapr17 or 18, depending on the node). I verified many, many times that these are identical to what's in the olcServerID, and identical to the provider in the olcSyncrepl attribute.
I'm the only one accessing these nodes right now, and certainly the only one using this test DN.
------------- TEST SCENARIO -------------
My test for this mostly follows test063-delta-multimaster.
Below, $TESTDN is cn=fakeentryfortesting,ou=people,dc=temple,dc=edu.
It seeds a test entry and makes sure it is the same on each node:

dn: $TESTDN
objectClass: inetorgperson
cn: fakeentryfortesting
sn: fakeentryfortesting
givenName: $TESTCN
description: test

It breaks replication on each node by replacing the password in the olcSyncrepl values with a wrong password.
Below, $X is 16, 17 or 18, depending on the node.
For each of the three nodes, it applies:

dn: $TESTDN
changetype: modify
add: description
description: written on $X

Then for each node it applies:

dn: $TESTDN
changetype: modify
delete: description
description: test
-
add: description
description: take two from $X

Then on 16, it applies:

dn: $TESTDN
changetype: modify
add: carLicense
carLicense: 123-XYZ
-
add: employeeNumber
employeeNumber: from$X

and on a different node it applies:

dn: $TESTDN
changetype: modify
add: employeeType
employeeType: deadwood
-
add: employeeNumber
employeeNumber: from$X

It then restores replication and checks the entry on the different nodes.
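The final check can be sketched in Python as an attribute-by-attribute comparison. This is only an illustration under assumptions: the helper names parse_entry and diff_nodes are hypothetical, the input is plain unfolded LDIF for a single entry (no base64 "::" values, no folded lines), and the node labels a, b and c are placeholders rather than the real hosts. The sample data is an abbreviated version of the entry shown above.

```python
from collections import defaultdict


def parse_entry(ldif):
    """Map lowercased attribute name -> set of values for one unfolded LDIF entry."""
    attrs = defaultdict(set)
    for line in ldif.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(": ")
        attrs[name.lower()].add(value)
    return dict(attrs)


def diff_nodes(entries):
    """Return {attribute: {node: values}} for attributes whose value sets differ."""
    all_attrs = set()
    for entry in entries.values():
        all_attrs.update(entry)
    diffs = {}
    for attr in sorted(all_attrs):
        per_node = {node: entry.get(attr, set()) for node, entry in entries.items()}
        if len({frozenset(values) for values in per_node.values()}) > 1:
            diffs[attr] = per_node
    return diffs


# Abbreviated sample entries; "a", "b", "c" are placeholder node labels.
common = (
    "dn: cn=fakeentryfortesting,ou=people,dc=temple,dc=edu\n"
    "carLicense: 123-XYZ\n"
    "employeeType: deadwood\n"
)
entries = {
    "a": parse_entry(common + "employeeNumber: from16\n"),
    "b": parse_entry(common + "employeeNumber: from16\n"),
    "c": parse_entry(common + "employeeNumber: from17\n"),
}
diffs = diff_nodes(entries)
for attr, per_node in diffs.items():
    print(attr, per_node)
```

Comparing value sets rather than raw text keeps the check independent of the order in which a server returns multi-valued attributes such as description.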
------------------ INITIAL STANDALONE CONFIG ------------------

dn: cn=config
objectClass: olcGlobal
cn: config
olcArgsFile: /var/run/openldap/slapd.args
olcAuthzPolicy: to
olcAuthzRegexp: {0}uid=ldap/([^,]*),cn=PREKDC.TEMPLE.EDU,cn=gssapi,cn=auth $1
olcConnMaxPending: 101
olcLogLevel: stats stats2
olcPidFile: /var/run/openldap/slapd.pid
olcReadOnly: FALSE
olcSecurity: simple_bind=128
olcSecurity: tls=0
olcSecurity: ssf=71
olcSockbufMaxIncoming: 22143
olcSockbufMaxIncomingAuth: 16777215
olcThreads: 8
olcTLSCACertificateFile: /etc/ssl/certs/intermediate.crt
olcTLSCACertificatePath: /etc/openldap/certs
olcTLSCertificateFile: /etc/ssl/certs/ldap.pem
olcTLSCertificateKeyFile: /etc/ssl/private/ldap.key.pem
olcTLSCipherSuite: ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS:!RC4:!3DES:!eNULL:!PSK
olcTLSDHParamFile: /etc/ssl/private/dhparam.bin
olcTLSProtocolMin: 3.2
olcToolThreads: 2

dn: cn=module{0},cn=config
objectClass: olcModuleList
cn: module{0}
olcModulePath: /usr/local/libexec/openldap
olcModuleLoad: {0}back_mdb
olcModuleLoad: {1}auditlog
olcModuleLoad: {2}unique

dn: cn=schema,cn=config
# ... entry and subtree omitted for brevity

dn: olcDatabase={-1}frontend,cn=config
objectClass: olcDatabaseConfig
objectClass: olcFrontendConfig
olcDatabase: {-1}frontend
olcAccess: {0} to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth manage by dn.base="cn=replicator,dc=temple,dc=edu" manage by * break
olcAccess: {1}to dn.base="cn=Subschema" by * read
olcAccess: {2}to dn.exact="" by * read
olcLastMod: TRUE
olcReadOnly: FALSE
olcSizeLimit: size.soft=2000 size.hard=1000000
olcTimeLimit: time.soft=300 time.hard=3600
olcPasswordHash: {SSHA}
olcSortVals: hasMember

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcAccess: {0}to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth manage by dn.base="cn=replicator,dc=temple,dc=edu" read by dn.exact="cn=admin,dc=temple,dc=edu" manage by * break
olcAccess: {1}to dn.base="cn=Subschema" by * read
olcAccess: {2}to dn.exact="" by * read
olcAccess: {3}to * by * none
olcAddContentAcl: TRUE
olcLastMod: TRUE
olcRootDN: cn=admin,cn=config
olcRootPW: {SSHA}..........
olcMonitoring: FALSE

dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /usr/local/var/lib/ldap/rootdb
olcSuffix: dc=temple,dc=edu
olcAccess: {0} to dn.exact="cn=Subschema" by * read
olcAccess: {1} to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth manage by dn.base="cn=replicator,dc=temple,dc=edu" manage by * break
olcAccess: {2} to attrs=userPassword by * auth
olcAccess: {3} to dn.exact="" by * read
olcLastMod: TRUE
olcLimits: {0}dn.base="cn=replicator,dc=temple,dc=edu" size=unlimited time=unlimited
olcRootDN: cn=admin,dc=temple,dc=edu
olcRootPW: {SSHA}.................
olcMonitoring: TRUE
olcDbEnvFlags: nosync
olcDbIndex: default eq
olcDbIndex: uid pres,eq,sub
olcDbIndex: uidNumber,gidNumber eq
olcDbIndex: member,memberUid eq
olcDbIndex: templeEduTUID eq,sub
olcDbIndex: templeEduTUNIC eq
# other indexes omitted
olcDbIndex: cn,sn pres,eq,approx,sub
olcDbIndex: mail pres,eq,sub
olcDbIndex: objectClass pres,eq
olcDbIndex: loginShell pres,eq
olcDbIndex: entryCSN,entryUUID
olcDbIndex: givenName,o,name,ou,displayName,eduPersonNickname eq,sub
olcDbIndex: hasMember eq
olcDbIndex: mailAlternateAddress eq,sub
olcDbIndex: manager eq
olcDbIndex: owner eq
olcDbIndex: uniqueMember eq
olcDbMaxSize: 17179869184

dn: olcOverlay={0}auditlog,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcAuditLogConfig
olcOverlay: {0}auditlog
olcAuditlogFile: /var/log/ldap/audit.ldif

dn: olcOverlay={1}unique,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcUniqueConfig
olcOverlay: {1}unique
olcUniqueAttribute: templeEduTUNA
olcUniqueAttribute: templeEduTUid

-------------- INITIAL DATA --------------

dn: dc=temple,dc=edu
objectClass: dcObject
objectClass: organization
o: temple.edu
dc: temple

dn: cn=replicator,dc=temple,dc=edu
objectClass: simpleSecurityObject
objectClass: organizationalRole
cn: replicator
description: LDAP replicator
userPassword: {SSHA}....

dn: ou=people,dc=temple,dc=edu
objectClass: top
objectClass: organizationalUnit
ou: people

------------------ MODIFICATION TO MAKE IT DELTA MMR ------------------

dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: syncprov
olcModuleLoad: accesslog

dn: cn=config
changetype: modify
replace: olcServerID
olcServerID: 116 ldaps://pre-ldapr16.temple.edu
olcServerID: 117 ldaps://pre-ldapr17.temple.edu
olcServerID: 118 ldaps://pre-ldapr18.temple.edu

dn: olcDatabase={2}mdb,cn=config
changetype: add
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {2}mdb
olcDbDirectory: /usr/local/var/lib/ldap/accesslog
olcSuffix: cn=accesslog
olcAccess: {0}to dn.subtree="cn=accesslog" by dn.exact="cn=replicator,dc=temple,dc=edu" read by dn.exact="gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth" read
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcReadOnly: FALSE
olcRootDN: cn=config
olcLimits: dn.exact="cn=replicator,dc=temple,dc=edu" time=unlimited size=unlimited
olcSizeLimit: unlimited
olcTimeLimit: unlimited
olcMonitoring: TRUE
olcDbCheckpoint: 0 0
olcDbIndex: entryCSN,objectClass,reqEnd,reqResult,reqStart,reqDN,entryUUID eq
olcDbMode: 0600
olcDbSearchStack: 16
olcDbMaxsize: 85899345920

dn: olcOverlay={0}syncprov,olcDatabase={2}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpNoPresent: TRUE
olcSpReloadHint: TRUE

dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpCheckpoint: 20 10
olcSpSessionlog: 10000000
olcSpReloadHint: TRUE

dn: olcOverlay={1}accesslog,olcDatabase={1}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcAccessLogConfig
olcOverlay: {1}accesslog
olcAccessLogDB: cn=accesslog
olcAccessLogOps: writes
olcAccessLogSuccess: TRUE
olcAccessLogPurge: 01+00:00 00+04:00

dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSyncrepl
olcSyncrepl: {0}rid=306
  provider=ldaps://pre-ldapr16.temple.edu
  bindmethod=simple
  binddn="cn=replicator,dc=temple,dc=edu"
  credentials=password
  keepalive=0:5:0
  tls_reqcert=allow
  searchbase="dc=temple,dc=edu"
  scope=sub
  schemachecking=on
  type=refreshAndPersist
  retry="5 5 5 +"
  logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
  logbase=cn=accesslog
  syncdata=accesslog
olcSyncrepl: {1}rid=307
  provider=ldaps://pre-ldapr17.temple.edu
  bindmethod=simple
  binddn="cn=replicator,dc=temple,dc=edu"
  credentials=password
  keepalive=0:5:0
  tls_reqcert=allow
  searchbase="dc=temple,dc=edu"
  scope=sub
  schemachecking=on
  type=refreshAndPersist
  retry="5 5 5 +"
  logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
  logbase=cn=accesslog
  syncdata=accesslog
olcSyncrepl: {2}rid=308
  provider=ldaps://pre-ldapr18.temple.edu
  bindmethod=simple
  binddn="cn=replicator,dc=temple,dc=edu"
  credentials=password
  keepalive=0:5:0
  tls_reqcert=allow
  searchbase="dc=temple,dc=edu"
  scope=sub
  schemachecking=on
  type=refreshAndPersist
  retry="5 5 5 +"
  logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
  logbase=cn=accesslog
  syncdata=accesslog
-
replace: olcMirrorMode
olcMirrorMode: TRUE

You really read this far down? Awesome! Maybe something jumped out at you as wrong with this?
Thanks for any help, Zach
--On Friday, January 24, 2020 9:45 PM +0000 Zach Hanson Hart zach@temple.edu wrote:
I am seeing inconsistent values for an attribute after conflicting writes. I mostly followed https://mishikal.wordpress.com/2019/04/23/configuring-mmr-using-delta-syncrepl-in-openldap-updating-an-existing-standalone-configuration/ for the setup, and mostly followed test063-delta-multimaster from the source code for the test scenario.
I would suggest modifying one of the test or regression scripts to reproduce this error and submit an ITS along with the script.
Regards, Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com
openldap-technical@openldap.org