Hello All!
I have started building a large-scale OpenLDAP infrastructure for the company I work for, and I'm running into one issue I can't seem to resolve.
The final architecture is 4 master servers (each one in its own data center across the USA). We also want to have 2 slaves tied to each master. We deal with an incredibly high amount of authentication traffic.
I currently have the 4 masters configured (n-master, using syncrepl), and that is functioning as designed: the cn=config and user databases are successfully replicated across the 4 servers no matter which server you use to update.
What I am having issues with is getting the slaves to sync to the masters.
I have the rpuser set up, and the machines can talk to each other. I can run queries using the rpuser from any slave to any master and get data back. I can see the rpuser connecting to the master and showing successful authentication in an attempt to replicate back to the slave.
But this error comes up:

do_syncrep2: rid=010 got search entry without Sync State control

and user data is not replicated back to the slave.
Some additional notes: on the slaves, I did NOT replicate the cn=config db {0} only the
Here is the LDIF file (with hostnames/passwd removed):

dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=010
  provider=ldap://master-1.example.com:389/
  bindmethod=simple
  binddn="uid=rpuser,dc=example,dc=com"
  credentials=banana
  searchbase="dc=example,dc=com"
  type=refreshAndPersist
  retry="30 5 300 3"
  interval=00:00:05:00

Here is the applied config on the slave server:

# {1}mdb, config
dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /var/lib/ldap
olcSuffix: dc=squaretrade,dc=com
olcAccess: {0}to attrs=userPassword by self write by anonymous auth by * none
olcAccess: {1}to attrs=shadowLastChange by self write by * read
olcAccess: {2}to * by * read
olcLastMod: TRUE
olcRootDN: cn=admin,dc=example,dc=com
olcSyncrepl: {0}rid=010 provider=ldap://master-1.example.com
 :389/ bindmethod=simple binddn="uid=rpuser,dc=example,dc=com" credentials
 =banana searchbase="dc=example,dc=com" type=refreshAndPersist retry="30 5 300 3" interval=00:00:05:00
olcDbCheckpoint: 512 30
olcDbIndex: objectClass eq
olcDbIndex: cn,uid eq
olcDbIndex: uidNumber,gidNumber eq
olcDbIndex: member,memberUid eq
olcDbMaxSize: 1073741824

Here is the syncprov config on the master it is communicating with:

# {0}syncprov, {1}mdb, config
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov

My questions:
1> Does the slave also require the cn=config database replication?
2> Do the masters need similar configs (i.e. like the n-master config)? Does RID=010 also need to be configured on the master?
Here is a section of logs from a sync attempt:

Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=11 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: =>do_syncrepl rid=010
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=12 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=13 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: =>do_syncrep2 rid=010
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: do_syncrep2: rid=010 got search entry without Sync State control (dc=example,dc=com)
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: do_syncrepl: rid=010 rc -1 retrying (1 retries left)
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: activity on 1 descriptor
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: activity on:
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]:
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=11 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=12 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=13 active_threads=0 tvp=zero

Any assistance would be greatly appreciated! And if there is additional information that will help, just ask!
Christopher
--On Tuesday, April 24, 2018 10:34 AM -0600 Chris Cardone ccardone@squaretrade.com wrote:
Hi Chris,
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=010
  provider=ldap://master-1.example.com:389/
  bindmethod=simple
  binddn="uid=rpuser,dc=example,dc=com"
  credentials=banana
  searchbase="dc=example,dc=com"
  type=refreshAndPersist
  retry="30 5 300 3"
  interval=00:00:05:00

Are you really using dc=example,dc=com as the search base? Because your DB is configured for dc=squaretrade,dc=com.
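If the real suffix is in fact the olcSuffix shown in the applied config, a minimal sketch of bringing the consumer's searchbase in line with it could look like the following. This assumes dc=squaretrade,dc=com is the actual naming context on the masters; the provider host, binddn, and credentials are left exactly as posted and may need the same adjustment if they were only sanitized for the list.

```ldif
# Hypothetical correction: only searchbase differs from the posted LDIF.
# Assumes dc=squaretrade,dc=com is the real suffix served by the masters.
dn: olcDatabase={1}mdb,cn=config
changetype: modify
replace: olcSyncrepl
olcSyncrepl: rid=010
  provider=ldap://master-1.example.com:389/
  bindmethod=simple
  binddn="uid=rpuser,dc=example,dc=com"
  credentials=banana
  searchbase="dc=squaretrade,dc=com"
  type=refreshAndPersist
  retry="30 5 300 3"
  interval=00:00:05:00
```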
# {1}mdb, config
dn: olcDatabase={1}mdb,cn=config
olcAccess: {0}to attrs=userPassword by self write by anonymous auth by * none

If this is the same as your ACL on the master, the replica will be unable to read userPassword changes. This will become problematic in the long run.
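One way to address this, sketched here under the assumption that the masters carry the same {0} rule and that the replication identity is the uid=rpuser DN from the post, is to give that DN read access ahead of the other clauses:

```ldif
# Sketch only: replaces rule {0} on the provider so the replication
# user can read userPassword. The rpuser DN is taken from the post;
# adjust it to the real suffix if it was sanitized.
dn: olcDatabase={1}mdb,cn=config
changetype: modify
delete: olcAccess
olcAccess: {0}
-
add: olcAccess
olcAccess: {0}to attrs=userPassword
  by dn.exact="uid=rpuser,dc=example,dc=com" read
  by self write
  by anonymous auth
  by * none
```

The by clauses are evaluated in order, so the rpuser clause has to come before "by * none".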
olcSyncrepl: {0}rid=010 provider=ldap://master-1.example.com
 :389/ bindmethod=simple binddn="uid=rpuser,dc=example,dc=com" credentials
 =banana searchbase="dc=example,dc=com" type=refreshAndPersist retry="30 5 300 3" interval=00:00:05:00

Same comment here about the searchbase being invalid.
olcDbCheckpoint: 512 30
I suggest reading the man page for slapd-mdb(5) and the checkpoint parameter (just so you're aware that one of those values provided is ignored).
olcDbIndex: objectClass eq
olcDbIndex: cn,uid eq
olcDbIndex: uidNumber,gidNumber eq
olcDbIndex: member,memberUid eq
olcDbMaxSize: 1073741824

You're missing the required indices for replication. Please read the documentation thoroughly.
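For reference, the replication examples in the OpenLDAP Admin Guide index entryCSN and entryUUID for equality. A minimal sketch of adding them, assuming the {1}mdb database DN from the post:

```ldif
# Sketch: add the equality indices used by syncrepl lookups.
# The database DN is the one shown in the post.
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcDbIndex
olcDbIndex: entryCSN eq
-
add: olcDbIndex
olcDbIndex: entryUUID eq
```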
here is the syncprov config on the master it is communicating with
# {0}syncprov, {1}mdb, config
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov

You're missing a few items, such as:
olcSpCheckpoint
olcSpSessionlog
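A minimal sketch of adding those two attributes to the overlay entry quoted above. The values are illustrative only (they mirror the style of the Admin Guide examples) and are not tuned for this deployment:

```ldif
# Sketch: illustrative values, not recommendations.
# olcSpCheckpoint takes <ops> <minutes>; olcSpSessionlog takes <ops>.
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSpCheckpoint
olcSpCheckpoint: 100 10
-
add: olcSpSessionlog
olcSpSessionlog: 100
```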
My questions
1> does the slave also require the cn=config database replication?
It shouldn't, no.
2> do the masters need similar configs (i.e. like the n-master config) does RID=010 also need to be configured on the master?
No. The documentation clearly states that RIDs are tracked internally per slapd. A given slapd has zero knowledge of what RID values are used on other servers, and doesn't require it.
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: do_syncrep2: rid=010 got search entry without Sync State control (dc=example,dc=com)
This again shows you using the incorrect base. I believe this is the expected behavior when that is the case.
Warm regards,
Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com
openldap-technical@openldap.org