Hi Quanah,
On 05/06/2019 19:09, Quanah Gibson-Mount wrote:
--On Wednesday, June 05, 2019 11:38 AM +0100 Mark Cairney <Mark.Cairney@ed.ac.uk> wrote:
Hi,
We currently run a multi-master setup where all 4 of our servers replicate from each other via delta-syncrepl but all our writes are directed at a selected "master" server.
I've noticed recently that we are suffering sync issues and this coincides with us upgrading from 2.4.46 with a patch for ITS8843 to 2.4.47.
Hi Mark,
Error 0x14 would be attribute type or value already exists. This would indicate a fundamental problem of there being discrepancies between your servers. I.e., your databases are out of sync.
That smells a little bit like NTP sync but these boxes are all set to aggressively poll a single NTP server:
server 129.215.205.191 minpoll 4 maxpoll 6
restrict 129.215.205.191 nomodify notrap
There is a (possibly self-inflicted) point of pain where we have an exattrs=memberOf in our syncrepl config to work around another replication issue so this means that any user objects which are REFRESH'ed end up losing all their group memberships.
You are aware the slapo-memberof(5) man page specifically states it is not compatible with syncrepl based replication, in particular because of the way in which the REFRESH phase functions combined with user/group entry location (creation time) in the database?
I would add that the exattrs line shouldn't be necessary with a proper configuration. I'm not sure what issue you were trying to avoid with this setting. The ITS#8444 regression test in the testsuite specifically has a 4-way MMR setup with memberof, and requires no such setting nor does it exhibit the issues you mention.
I.e., the only way you can ensure slapo-memberof will be "ok" in a replicated environment (syncrepl or delta-syncrepl) is if you can guarantee a REFRESH will never occur.
I think this was done in part to mitigate the behaviour in that ITS. If memberOf isn't compatible with syncrepl then does that mean you can't have replication and use the memberOf overlay? This would pose a problem for many medium-large installations surely? Looking back through the mailing list archive I do note a discussion of the whys and wherefores in September 2018.
The man page suggests using the dynlist overlay instead, but with my limited understanding I don't see how that would work, since the most common use case of the memberOf overlay is to get the group memberships when the user object is queried, e.g. ldapsearch "uid=mcairney" "*" memberof
In general though it does seem like replication and attributes maintained locally via overlays (memberOf, ppolicy etc) are an absolute minefield!
As for the configuration of the server, see below.
Are there any known bugs/regressions with delta-syncrepl in 2.4.47? One idea I've had is to go to a true single-master config by setting the current consumers to be read-only and having a single olcSyncrepl clause for the master on these 3 servers.
dn: olcOverlay={0}dynlist,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcDynamicList
olcOverlay: {0}dynlist
olcDlAttrSet: {0}groupOfURLs memberURL

dn: olcOverlay={1}memberof,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcMemberOf
olcOverlay: {1}memberof
olcMemberOfDangling: ignore
olcMemberOfRefInt: TRUE
olcMemberOfGroupOC: groupOfNames
olcMemberOfMemberAD: member
olcMemberOfMemberOfAD: memberOf

dn: olcOverlay={2}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcConfig
objectClass: top
objectClass: olcSyncProvConfig
olcOverlay: {2}syncprov

dn: olcOverlay={3}accesslog,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcAccessLogConfig
olcOverlay: {3}accesslog
olcAccessLogDB: cn=accesslog
olcAccessLogOps: writes
olcAccessLogPurge: 02+00:00 00+04:00
olcAccessLogSuccess: TRUE
As I've noted a number of times on the list, overlay instantiation order is important for operation interception/processing. The syncprov overlay should be the first instantiated overlay, followed by accesslog, in a delta-syncrepl setup. In the above, this is clearly not the case.
I thought I'd seen this mentioned but wasn't 100% sure. I've now re-ordered my overlays as follows:
dn: olcDatabase={1}mdb,cn=config
# {0}syncprov, {1}mdb, config
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config

# {1}accesslog, {1}mdb, config
dn: olcOverlay={1}accesslog,olcDatabase={1}mdb,cn=config

# {2}memberof, {1}mdb, config
dn: olcOverlay={2}memberof,olcDatabase={1}mdb,cn=config

# {3}dynlist, {1}mdb, config
dn: olcOverlay={3}dynlist,olcDatabase={1}mdb,cn=config
I would additionally note that the syncprov overlay is missing a sessionlog setting, where the default is likely much smaller than required for mitigating ITS#8125.
...and I've set olcSpSessionlog to 500 (I'm not sure what the default is; 100?). Hopefully that's sufficient to avoid ITS#8125.
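For anyone following along, here is a minimal sketch of that change as an ldapmodify LDIF. It assumes the syncprov overlay now sits at index {0} on the {1}mdb database, as in the listing above, and the value 500 is simply what I picked rather than a recommended size:

```ldif
# Sketch only: sets the syncprov session log size on the {1}mdb database.
# The DN assumes the re-ordered overlay layout shown above ({0}syncprov).
# 500 is the value chosen in this thread, not a tuned recommendation.
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
changetype: modify
replace: olcSpSessionlog
olcSpSessionlog: 500
```

Something like "ldapmodify -Y EXTERNAL -H ldapi:/// -f sessionlog.ldif" should apply it, where the file name is hypothetical and the bind method depends on how access to cn=config is set up locally.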
Hope that helps!
It does! Many thanks,
Mark
--Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com