Apologies if this is not the correct forum for this problem. I'm having replication issues with a two-node multi-provider setup that I'm trying to diagnose, and any help would be appreciated. If there is a replication troubleshooting guide that I can follow to diagnose these issues myself, I'm more than happy to work through it; please let me know.
I have the following:
* two Ubuntu 18.04 hosts (alpha, bravo)
* each has 2.4.45+dfsg-1ubuntu1.4
* ldaps:// is configured on each and visible from the other
* queries by clients work fine
* time is in sync between them
* the memberof overlay is present and appears to work
* the active configuration is in olc (cn=config) form
Summary:
The replication issue manifests itself this way:
* ldapmodify of a user record attribute on bravo replicates ok to alpha, but the user record there loses the memberof attributes
* ldapmodify of a user record attribute on alpha never replicates to bravo
I can "repair" the memberof attribute for a user across both hosts by removing them from a group and then reinstating them; however, these ldapmodify changes must be made on bravo.
Other observations are that the root node on each server seems to have different entryuuid attributes but the contextcsn from each server is present and is replicated (as shown at the bottom of this email) ...
I'm also wondering whether I might have the layering of these overlays wrong - I have them in order {0}back_mdb, {1}memberof, {2}refint, {3}syncprov ... should those instead be {0}back_mdb, {1}syncprov, {2}memberof, {3}refint?
Background:
This setup was initially just alpha with a nis/rfc2307 schema, but I copied out all of that data, converted it to rfc2307bis, configured bravo from scratch with an rfc2307bis schema, imported my converted data and successfully migrated my clients to use bravo.
Once the clients (or as many of them as I could) were looking to bravo, I dropped all the data out of alpha and then configured it from scratch to match that on bravo. Unfortunately I know I got bits of this part of the process wrong and this may have soured my result:
* olcServerID was missing on both hosts
* initially olcSyncrepl was explicitly excluding memberof from replication with 'exattr="memberof"'
* because I couldn't get replication working, I eventually manually added the same data to alpha as I'd used to populate bravo
* at some point replication *did* kind of work and my users immediately lost memberof across the board
Since then, I've gradually remediated the config and data across both hosts, but replication still refuses to cooperate (as summarised above). These are the bits of the config that appear to be relevant, and as far as I can tell, they look OK to me:
alpha (slightly edited for brevity) - bravo differs only in the value for olcServerID, olcRootPW hash and olcSyncrepl peer uri:
dn: cn=config
cn: config
objectClass: olcGlobal
olcArgsFile: /var/run/slapd/slapd.args
olcLogLevel: none
olcPidFile: /var/run/slapd/slapd.pid
olcServerID: 2
olcTLSCACertificateFile: /etc/ldap/sasl2/ca-certificates.crt
olcTLSCertificateFile: /etc/ldap/sasl2/server.crt
olcTLSCertificateKeyFile: /etc/ldap/sasl2/server.key
olcTLSVerifyClient: never
olcToolThreads: 1

dn: cn=module{0},cn=config
cn: module{0}
objectClass: olcModuleList
olcModuleLoad: {0}back_mdb
olcModuleLoad: {1}memberof
olcModuleLoad: {2}refint
olcModuleLoad: {3}syncprov
olcModulePath: /usr/lib/ldap
: (omitting pages of schema definitions) :
dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcAccess: {0}to attrs=userPassword by self write by anonymous auth by * none
olcAccess: {1}to attrs=shadowLastChange by self write by * read
olcAccess: {2}to * by * read
olcAccess: {3}to * by dn.base="cn=admin,dc=domain,dc=com" read by * break
olcDatabase: {1}mdb
olcDbCheckpoint: 512 30
olcDbDirectory: /var/lib/ldap
olcDbIndex: cn,uid eq
olcDbIndex: member,memberUid eq
olcDbIndex: objectClass eq
olcDbIndex: uidNumber,gidNumber eq
olcDbMaxSize: 1073741824
olcLastMod: TRUE
olcMirrorMode: TRUE
olcRootDN: cn=admin,dc=domain,dc=com
olcRootPW:: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
olcSuffix: dc=domain,dc=com
olcSyncrepl: {0}rid=000 provider=ldaps://bravo.domain.com:636 type=refreshAndPersist retry="5 5 300 +" searchbase="dc=domain,dc=com" attrs="*,+" bindmethod=simple binddn="cn=admin,dc=domain,dc=com" credentials=xxxxxxxxxxxx

dn: olcOverlay={0}memberof,olcDatabase={1}mdb,cn=config
objectClass: olcConfig
objectClass: olcMemberOf
objectClass: olcOverlayConfig
objectClass: top
olcMemberOfDangling: ignore
olcMemberOfGroupOC: groupOfNames
olcMemberOfMemberAD: member
olcMemberOfMemberOfAD: memberOf
olcMemberOfRefInt: TRUE
olcOverlay: {0}memberof
structuralObjectClass: olcMemberOf

dn: olcOverlay={1}refint,olcDatabase={1}mdb,cn=config
objectClass: olcConfig
objectClass: olcOverlayConfig
objectClass: olcRefintConfig
objectClass: top
olcOverlay: {1}refint
olcRefintAttribute: memberof member manager owner
structuralObjectClass: olcRefintConfig

dn: olcOverlay={2}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcSyncProvConfig
olcOverlay: {2}syncprov
olcSpCheckpoint: 100 10
structuralObjectClass: olcSyncProvConfig
The root dc=domain,dc=com entry on both hosts, for comparison:
# alpha
dn: dc=domain,dc=com
contextCSN: 20200504085236.014362Z#000000#002#000000
contextCSN: 20200504090305.772541Z#000000#001#000000
createTimestamp: 20200429092623Z
creatorsName: cn=admin,dc=domain,dc=com
dc: internal
entryCSN: 20200429092623.624698Z#000000#000#000000
entryUUID: 3cd3ddfe-1e47-103a-9b7c-636addc89a1e
modifiersName: cn=admin,dc=domain,dc=com
modifyTimestamp: 20200429092623Z
o: domain.com
objectClass: dcObject
objectClass: organization
objectClass: top
structuralObjectClass: organization

# bravo
dn: dc=domain,dc=com
contextCSN: 20200504085236.014362Z#000000#002#000000
contextCSN: 20200504090303.909881Z#000000#001#000000
createTimestamp: 20200406074258Z
creatorsName: cn=admin,dc=domain,dc=com
dc: internal
entryCSN: 20200406074258.475068Z#000000#000#000000
entryUUID: fac4ec88-0c25-103a-80b5-470686e83bfd
modifiersName: cn=admin,dc=domain,dc=com
modifyTimestamp: 20200406074258Z
o: domain.com
objectClass: dcObject
objectClass: organization
objectClass: top
structuralObjectClass: organization
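For anyone wanting to compare the contextCSN values from the two root entries without eyeballing them, a minimal Python sketch follows. It assumes the usual CSN layout of timestamp#count#sid#mod, with the server ID field in hex, and it only looks at values already pasted in this email. The function names (parse_csn, latest_by_sid, compare_context_csns) are made up for illustration and are not part of any OpenLDAP tooling.

```python
from typing import Dict, List, NamedTuple


class CSN(NamedTuple):
    timestamp: str  # e.g. "20200504090305.772541Z"
    count: str
    sid: int        # server ID (assumed to be a hex field in the CSN)
    mod: str


def parse_csn(value: str) -> CSN:
    """Split a CSN string of the form timestamp#count#sid#mod."""
    timestamp, count, sid, mod = value.strip().split("#")
    return CSN(timestamp, count, int(sid, 16), mod)


def latest_by_sid(values: List[str]) -> Dict[int, str]:
    """Return the newest timestamp seen for each server ID."""
    latest: Dict[int, str] = {}
    for csn in map(parse_csn, values):
        # Timestamps are fixed-width, so string comparison orders them.
        if csn.sid not in latest or csn.timestamp > latest[csn.sid]:
            latest[csn.sid] = csn.timestamp
    return latest


def compare_context_csns(left: List[str], right: List[str]) -> Dict[int, str]:
    """Report, per server ID, whether the two hosts hold the same CSN."""
    l, r = latest_by_sid(left), latest_by_sid(right)
    report: Dict[int, str] = {}
    for sid in sorted(set(l) | set(r)):
        lt, rt = l.get(sid), r.get(sid)
        if lt == rt:
            report[sid] = "in sync"
        elif rt is None or (lt is not None and lt > rt):
            report[sid] = "left ahead"
        else:
            report[sid] = "right ahead"
    return report


# contextCSN values copied from the two root entries above.
ALPHA = [
    "20200504085236.014362Z#000000#002#000000",
    "20200504090305.772541Z#000000#001#000000",
]
BRAVO = [
    "20200504085236.014362Z#000000#002#000000",
    "20200504090303.909881Z#000000#001#000000",
]

if __name__ == "__main__":
    for sid, status in compare_context_csns(ALPHA, BRAVO).items():
        print(f"sid {sid:03x}: {status}")
```

With alpha as the left-hand side, this reports server ID 002 as matching on both hosts and server ID 001 as differing by about two seconds. That gap may simply reflect when each snapshot was taken, so it is a starting point for investigation rather than a diagnosis. The entryCSN values on both root entries carry server ID 000, which would be consistent with the entries having been created before olcServerID was set.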
Regards, Malcolm
--On Wednesday, May 6, 2020 5:27 PM +1000 Malcolm Herbert openldap.org@mjch.net wrote:
> Apologies if this is not the correct forum for this problem, however I'm having replication issues with a two-node multi-provider setup which I'm trying to diagnose and any help would be appreciated (if there is a replication troubleshooting guide available that I can follow to diagnose these issues myself, I'm more than happy to work through that, please let me know)
Hello,
a) Numerous replication issues have been fixed since 2.4.45. Please use a current release.
b) Use delta-syncrepl instead of standard syncrepl.
c) memberOf is not replication safe at this time. This is documented in the man page for slapo-memberOf for current releases.
You may wish to read over https://www.openldap.org/software/release/changes.html as well.
Regards, Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com
openldap-technical@openldap.org