Hello,
we have a cluster of LDAP servers consisting of one provider and four consumers. We're upgrading the OS of the systems by providing a replacement system for each of the five systems. For each one, we stop slapd on the system to be replaced, dump the database with slapcat, copy the dump to the new system, and switch the hostname and IP of the two systems. We shut down the old system and reboot the new one. Then we slapadd the database on the new system. So far so good...
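For reference, the per-host procedure described above could be sketched roughly as follows. This is only an illustration: the dump path, the new host name, and the use of systemctl with a service called slapd are placeholders and assumptions, not details of the real setup. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the per-host migration steps described above.
# DUMP, NEW_HOST and the "slapd" service name are hypothetical placeholders.
DUMP="${DUMP:-/tmp/ldap-dump.ldif}"
NEW_HOST="${NEW_HOST:-new-ldap.example.org}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

migrate_steps() {
    run systemctl stop slapd              # on the old system
    run slapcat -l "$DUMP"                # dump the database to LDIF
    run scp "$DUMP" "$NEW_HOST:$DUMP"     # copy the dump to the new system
    # (switch hostname and IP, shut down the old system, reboot the new one)
    run slapadd -l "$DUMP"                # on the new system: import the dump
    run systemctl start slapd             # on the new system
}

migrate_steps
```

In practice the first three commands run on the old system and the last two on the new one; they are shown together only to make the order of the steps explicit.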
We started with the provider. After importing the database and starting slapd on the new provider, we now get errors for the syncrepl state on all consumer systems: "Can't get Context CSN with SID <x> from ldap+tls://localhost. Please set SID with -I option."
We're using check_ldap_syncrepl_status from ltb-project with icinga2 to monitor the replication.
I don't know how to fix this. When I modify an entry on the provider, the change is synced to the consumers, but the status error remains on all four consumers.
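Since the error message refers to a contextCSN with a particular SID, it may help to see which SIDs a server actually reports. Each CSN carries the server ID as its third '#'-separated field. A minimal sketch follows; the suffix dc=example,dc=org and the sample CSN value are made up for illustration and are not taken from the real cluster.

```shell
#!/bin/sh
# CSN layout: <timestamp>#<change count>#<SID>#<modification number>

# Print the SID (third '#'-separated field) of a single contextCSN value.
csn_sid() {
    printf '%s\n' "$1" | cut -d'#' -f3
}

# Read ldapsearch LDIF output on stdin and print the distinct SIDs
# found in lines of the form "contextCSN: <csn>".
list_sids() {
    sed -n 's/^contextCSN: //p' | cut -d'#' -f3 | sort -u
}

# Hypothetical sample value:
csn_sid '20240101120000.000000Z#000000#001#000000'

# Against a live server (the suffix is a placeholder; add bind options as needed):
#   ldapsearch -x -LLL -ZZ -H ldap://localhost -s base \
#       -b 'dc=example,dc=org' contextCSN | list_sids
```

Running the same query against the provider and against a consumer shows whether both report a contextCSN for the SID the check is looking for.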
I then 'upgraded' a consumer by starting slapd on the replacement system without any database. The system started the sync, and after it completed the error was gone. But syncing takes more time than importing a dump from a local file. And if we have to rebuild the provider from scratch after a crash, resyncing one consumer after another to rebuild the complete cluster might not be an option.
Is there a method to fix the problem?
Syncrepl configuration on the provider:

overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100

Syncrepl configuration on the consumer:

syncrepl rid=<x>
    provider=ldap://yxz:389
    binddn="cn=xxxxxxx"
    credentials="xxxxxxx"
    tls_cacertdir=/etc/pki/tls/certs
    tls_reqcert=demand
    bindmethod=simple
    starttls=critical
    searchbase="xxxxxx"
    type=refreshAndPersist
    retry="10 10 300 +"
    filter="(objectClass=*)"
    scope=sub
    attrs="*,+"
    sizelimit=unlimited
    timelimit=unlimited
    schemachecking=off

Regards
Berthold Cogel