Hello All!

I have started building a large-scale OpenLDAP infrastructure for the company I work for, and I'm running into one issue I can't seem to resolve.

The final architecture is 4 master servers (each one in its own data center across the USA), with 2 slaves tied to each master. We deal with a very high volume of authentication traffic.

I currently have the 4 masters configured (n-master, using syncrepl), and that part is functioning as designed: the cn=config and user databases replicate successfully across all 4 servers, no matter which server receives the update.

What I am having trouble with is getting the slaves to sync from the masters.

I have the rpuser set up, and the machines can talk to each other. I can run queries as the rpuser from any slave against any master and get data back. On the master, I can see the rpuser connecting and authenticating successfully when the slave attempts to replicate.
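
The replication account described above usually also needs unrestricted search limits on the provider, since a consumer has to be able to read the whole subtree in one search. A minimal sketch of how that might be expressed on a master, assuming the user database there is also {1}mdb and that no olcLimits value is set on it yet (both are assumptions, not something confirmed for this setup):

```ldif
# Hypothetical change on a master: lift time and size limits for the
# replication identity only. The database index {1} is assumed.
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcLimits
olcLimits: dn.exact="uid=rpuser,dc=example,dc=com" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
```

The rpuser would also need read access to every entry and attribute under the search base on the master. That is controlled by the master's olcAccess rules rather than by olcLimits.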

But this error comes up in the slave's log:

  do_syncrep2: rid=010 got search entry without Sync State control

and no user data is replicated to the slave.

Some additional notes:
On the slaves, I did NOT replicate the cn=config db {0}, only the

Here is the LDIF file (with hostnames/password removed):

dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=010
  provider=ldap://master-1.example.com:389/
  bindmethod=simple
  binddn="uid=rpuser,dc=example,dc=com"
  credentials=banana
  searchbase="dc=example,dc=com"
  type=refreshAndPersist
  retry="30 5 300 3"
  interval=00:00:05:00

Here is the applied config on the slave server:

# {1}mdb, config
dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /var/lib/ldap
olcSuffix: dc=squaretrade,dc=com
olcAccess: {0}to attrs=userPassword by self write by anonymous auth by * none
olcAccess: {1}to attrs=shadowLastChange by self write by * read
olcAccess: {2}to * by * read
olcLastMod: TRUE
olcRootDN: cn=admin,dc=example,dc=com
olcSyncrepl: {0}rid=010 provider=ldap://master-1.example.com
 :389/ bindmethod=simple binddn="uid=rpuser,dc=example,dc=com" credentials
 =banana searchbase="dc=example,dc=com" type=refreshAndPersist retry="30 5
  300 3" interval=00:00:05:00
olcDbCheckpoint: 512 30
olcDbIndex: objectClass eq
olcDbIndex: cn,uid eq
olcDbIndex: uidNumber,gidNumber eq
olcDbIndex: member,memberUid eq
olcDbMaxSize: 1073741824


Here is the syncprov config on the master it is communicating with:

# {0}syncprov, {1}mdb, config
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
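
For comparison, the overlay entry above carries only its object classes and no tuning attributes. A sketch of how a contextCSN checkpoint and a session log could be added to it, assuming illustrative values (checkpoint every 100 operations or 10 minutes, session log of 100 operations) that are not taken from this setup:

```ldif
# Hypothetical tuning of the existing syncprov overlay on the master.
# olcSpCheckpoint takes <ops> <minutes>; olcSpSessionlog takes <ops>.
dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSpCheckpoint
olcSpCheckpoint: 100 10
-
add: olcSpSessionlog
olcSpSessionlog: 100
```

Neither attribute is required for the overlay to answer sync searches. They affect how often the contextCSN is written out and how efficiently a reconnecting consumer can catch up.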


My questions:
1. Does the slave also require replication of the cn=config database?
2. Do the masters need a similar config (i.e. like the n-master config)? Does rid=010 also need to be configured on the master?


Here is a section of the slave's logs from a sync attempt:

Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=11 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: =>do_syncrepl rid=010
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=12 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=13 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: =>do_syncrep2 rid=010
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: do_syncrep2: rid=010 got search entry without Sync State control (dc=example,dc=com)
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: do_syncrepl: rid=010 rc -1 retrying (1 retries left)
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: activity on 1 descriptor
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: activity on:
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]:
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=11 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=12 active_threads=0 tvp=zero
Apr 18 09:27:36 la1-ldap-slave-prod-1 slapd[14543]: daemon: epoll: listen=13 active_threads=0 tvp=zero




Any assistance would be greatly appreciated!
If additional information would help, just ask!

Christopher