Guillaume Rousse wrote:
Hello list.
I'm using delta-syncrepl in refreshAndPersist mode between my slaves and my master server, and a nagios plugin to check sync status based on the value of the contextCSN attribute. But I often get sync alerts for unknown reasons.
First issue: is it expected to have a higher contextCSN on the slave side? From what I've understood of contextCSN, this attribute is updated each time a write operation is performed on the server. As the slave server is not supposed to perform any write operation, this should never happen. However, it does:
Ordinarily, a slave cannot initiate any write operations. However, you appear to be using ppolicy. The ppolicy overlay writes Bind status updates to the local server, regardless of master or slave status. Thus, it can cause the slave's contextCSN to be newer than the master's.
[root@etoile ~]# /usr/share/nagios/plugins/check_syncrepl.py ldap://ldap1.msr-inria.inria.fr ldap://ldap2.msr-inria.inria.fr -b dc=msr-inria,dc=inria,dc=fr -v
[..]
2009-05-25 13:36:49,740 - check_syncrepl.py - DEBUG - Retrieving Provider contextCSN
2009-05-25 13:36:49,741 - check_syncrepl.py - DEBUG - contextCSN = 20090520141922.274229Z#000000#000#000000
2009-05-25 13:36:49,742 - check_syncrepl.py - DEBUG - Retrieving Consumer contextCSN
2009-05-25 13:36:49,742 - check_syncrepl.py - DEBUG - contextCSN = 20090525095027.118111Z#000000#000#000000
2009-05-25 13:36:49,752 - check_syncrepl.py - INFO - Consumer NOT in SYNCH
2009-05-25 13:36:49,753 - check_syncrepl.py - INFO - Delta is -5 days, 4:28:55
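The delta reported above can be reproduced by hand from the two contextCSN values. What follows is only a minimal sketch, not the code of check_syncrepl.py: the helper name csn_time is hypothetical, and it assumes the part of the CSN before the first "#" is a UTC timestamp with microseconds.

```python
from datetime import datetime


def csn_time(csn: str) -> datetime:
    """Return the timestamp part of a contextCSN value (hypothetical helper).

    Assumes the text before the first '#' looks like YYYYmmddHHMMSS.uuuuuuZ.
    """
    stamp = csn.split("#", 1)[0]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")


# Values copied from the plugin output above.
provider = "20090520141922.274229Z#000000#000#000000"
consumer = "20090525095027.118111Z#000000#000#000000"

# Provider minus consumer: a negative result means the consumer is ahead.
delta = csn_time(provider) - csn_time(consumer)
print(delta)
```

A negative delta means the consumer carries the newer contextCSN, which is the situation described here.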
Second issue: how does syncrepl sync operational attributes? When using ppolicy, for instance, each failed bind operation results in a pwdFailureTime value being added to the user entry. From my own attempts, the slave and the master each maintain their own list separately.
As synchronisation is performed from master to slave only, it seems quite logical that a failed authentication on a slave doesn't affect the user entry on the master. However, according to the slapd.conf man page, syncrepl is supposed to synchronise operational attributes too by default: the attrs list defaults to "*,+" to return all user and operational attributes, and attrsonly is unset by default.
Also, the logs on the slave clearly show that something happens when a failed authentication is performed on the master.
Start state:
Provider contextCSN = 20090525122053.257812Z#000000#000#000000
Consumer contextCSN = 20090525122053.257812Z#000000#000#000000

Logs:
May 25 14:18:34 nation slapd[28717]: do_syncrep2: cookie=rid=123,csn=20090525121834.036489Z#000000#000#000000
May 25 14:18:34 nation slapd[28717]: slap_queue_csn: queing 0x93d55d8 20090525121834.036489Z#000000#000#000000
May 25 14:18:34 nation slapd[28717]: slap_graduate_commit_csn: removing 0x9450550 20090525121834.036489Z#000000#000#000000
May 25 14:18:34 nation slapd[28717]: syncrepl_message_to_op: rid=123 be_modify uid=rousse,ou=users,dc=msr-inria,dc=inria,dc=fr (0)
May 25 14:18:34 nation slapd[28717]: slap_queue_csn: queing 0x9461d18 20090525121834.036489Z#000000#000#000000
May 25 14:18:34 nation slapd[28717]: slap_graduate_commit_csn: removing 0x945e640 20090525121834.036489Z#000000#000#000000

End state:
Provider contextCSN = 20090525122053.257812Z#000000#000#000000
Consumer contextCSN = 20090525122122.287486Z#000000#000#000000
The provider didn't increase its contextCSN value while performing a change, and the consumer did increase its own while not performing the change :(
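The start and end states can be compared mechanically. The sketch below is illustrative only: the names CSN, parse_csn and sync_state are hypothetical, and it assumes the four "#"-separated fields are a timestamp, a change count, a server ID and a modification number, with the last three written in hexadecimal. It also assumes both values come from the same server ID, as they do here.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    """Fields of a contextCSN value (field meanings are an assumption)."""
    timestamp: str  # fixed-width, so string order matches time order
    count: int
    sid: int
    mod: int


def parse_csn(value: str) -> CSN:
    """Split a CSN on '#' and decode the numeric fields as hexadecimal."""
    timestamp, count, sid, mod = value.split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def sync_state(provider: str, consumer: str) -> str:
    """Report whether the consumer is in sync, behind, or ahead."""
    p, c = parse_csn(provider), parse_csn(consumer)
    if p == c:
        return "in sync"
    return "consumer behind" if c < p else "consumer ahead"


# Start and end states copied from the message above.
start = sync_state("20090525122053.257812Z#000000#000#000000",
                   "20090525122053.257812Z#000000#000#000000")
end = sync_state("20090525122053.257812Z#000000#000#000000",
                 "20090525122122.287486Z#000000#000#000000")
print(start)
print(end)
```

A monitoring check built this way could treat "consumer ahead" as a separate state rather than a plain sync failure, since the consumer here ends up with the newer value.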
Here is my syncrepl configuration:
syncrepl rid=123
    provider=ldaps://ldap1.msr-inria.inria.fr
    type=refreshAndPersist
    retry="60 +"
    logbase="cn=log"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    syncdata=accesslog
    searchbase="dc=msr-inria,dc=inria,dc=fr"
    scope=sub
    schemachecking=off
    bindmethod=simple
    binddn="cn=syncrepl,ou=roles,dc=msr-inria,dc=inria,dc=fr"
    credentials=XXXXXX