Full_Name:
Version: 2.4.8
OS: Linux 2.6.23.13
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (194.97.7.65)
We're still migrating from slurpd to syncrepl. We have a provider under OpenLDAP 2.4.8, fed by a 2.3 slurpd, that currently serves two OpenLDAP 2.4.8 consumers using "refreshAndPersist" and a retry interval of "180 +" (see below).
The database is fairly large (8 million objects, 15 GB of BDB files), and the consumers each have a database populated via syncrepl (pulled once, then cloned). There are ongoing updates "each second".
We see two issues:
When a consumer (re)connects, replication to the other, already running consumer pauses and only resumes once it continues on both nodes. This could become a problem, because a single consumer can then be a SPOF for the whole cluster.
Replication may suddenly stop for no reason we can see. Sometimes it recovers by itself; sometimes it recovers only after the consumer slapd is stopped and restarted. Is there a "golden path" to debug this issue?
For the logs: We have "sync" as log level.
Consumer 1 (rid 000):
[...]
Apr 10 16:15:18 0 slapd[27548]: connection_input: conn=192 deferring operation: binding
[...]
Apr 11 08:45:21 0 slapd[27548]: connection_input: conn=5182 deferring operation: binding
Apr 11 08:50:11 0 slapd[27548]: connection_input: conn=5207 deferring operation: binding
Apr 11 08:51:09 0 slapd[27548]: connection_input: conn=5212 deferring operation: binding
Apr 11 08:53:05 0 slapd[27548]: connection_input: conn=5222 deferring operation: binding
Apr 11 08:54:03 0 slapd[27548]: connection_input: conn=5228 deferring operation: binding

Consumer 2 (rid 002):
[...]
Apr 11 02:00:16 2 slapd[22272]: connection_input: conn=3151 deferring operation: binding
[...]
Apr 11 08:52:09 2 slapd[22272]: connection_input: conn=5224 deferring operation: binding
Apr 11 08:53:07 2 slapd[22272]: connection_input: conn=5229 deferring operation: binding

Provider (filtered for the last syncprov_sendresp entry per rid):
Apr 10 16:06:43 1 slapd[27917]: syncprov_sendresp: cookie=rid=000,csn=20080410140643.069039Z#000000#000#000000
Apr 11 01:57:58 1 slapd[27917]: syncprov_sendresp: cookie=rid=002,csn=20080410235758.696324Z#000000#000#000000
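For anyone who wants to repeat the "last syncprov_sendresp per rid" filtering on their own provider log, here is a minimal Python sketch. It is not something we run in production. It assumes the log line layout shown above and that the CSN is one continuous string (the blank inside the CSN in a pasted log is assumed to be a line-wrap artifact). The function name `last_csn_per_rid` is made up for illustration.

```python
import re
from datetime import datetime, timezone

# Matches the cookie printed by syncprov_sendresp at "sync" log level.
# Assumption: the CSN timestamp is 14 digits, a dot, 6 digits, then "Z#".
COOKIE_RE = re.compile(
    r"syncprov_sendresp: cookie=rid=(\d+),csn=(\d{14})\.(\d{6})Z#"
)


def last_csn_per_rid(lines):
    """Return {rid: UTC datetime} holding the newest CSN seen for each rid."""
    latest = {}
    for line in lines:
        m = COOKIE_RE.search(line)
        if not m:
            continue  # not a syncprov_sendresp cookie line
        rid, stamp, micros = m.groups()
        ts = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(
            microsecond=int(micros), tzinfo=timezone.utc
        )
        if rid not in latest or ts > latest[rid]:
            latest[rid] = ts
    return latest


sample = [
    "Apr 10 16:06:43 1 slapd[27917]: syncprov_sendresp: "
    "cookie=rid=000,csn=20080410140643.069039Z#000000#000#000000",
    "Apr 11 01:57:58 1 slapd[27917]: syncprov_sendresp: "
    "cookie=rid=002,csn=20080410235758.696324Z#000000#000#000000",
]

for rid, ts in sorted(last_csn_per_rid(sample).items()):
    print(rid, ts.isoformat())
```

Comparing these per-rid timestamps with the current time shows how long each consumer has gone without receiving an update from the provider.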
I see no indicators for failing operations, missing resources or any other trouble, except for the "connection_input" entries.
Any idea how to get sync replication running smoothly and reliably? Thank you very much in advance, of course!
OK, here are configuration snippets for provider and consumer, if it helps. To be honest: I am not quite sure about a reasonable size for syncprov-sessionlog, even after reading the slapo-syncprov man page. The actual number is meant as "big, but finite".
Consumer:
---------
syncrepl rid=2
         provider="ldap://provider:389"
         type=refreshAndPersist
         bindmethod=simple
         binddn=...
         credentials=...
         searchbase=...
         retry="180 +"
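To make clear what we expect from retry="180 +": as we read slapd.conf(5), the value is a list of "<retry interval> <# of retries>" pairs, and "+" as the count means retrying indefinitely. The following Python sketch only models that reading; it is not slapd's actual parser, and `retry_delays` is a hypothetical helper name.

```python
from itertools import islice


def retry_delays(spec):
    """Yield the wait in seconds before each reconnect attempt.

    Models the syncrepl retry value as "<interval> <count>" pairs,
    where a count of "+" means: keep retrying at that interval forever.
    """
    fields = spec.split()
    if len(fields) % 2:
        raise ValueError("retry spec must consist of <interval> <count> pairs")
    for interval, count in zip(fields[0::2], fields[1::2]):
        seconds = int(interval)
        if count == "+":
            while True:  # indefinite retries; later pairs are never reached
                yield seconds
        for _ in range(int(count)):
            yield seconds


# Our setting: retry every 180 seconds, without limit.
print(list(islice(retry_delays("180 +"), 4)))  # [180, 180, 180, 180]

# A finite example: two retries at 60 s, then one at 300 s.
print(list(retry_delays("60 2 300 1")))  # [60, 60, 300]
```

Under this reading, a consumer with our setting should never give up reconnecting, which is why a replication stop that needs a slapd restart surprises us.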
Provider:
---------
database bdb
dbnosync
cachesize 200000
suffix ...
rootdn ...
updatedn ...
rootpw ...
directory /var/ldap/db
lastmod on
index objectClass eq
...
index entryCSN,entryUUID eq
checkpoint 512 15
overlay syncprov
syncprov-sessionlog 300000
database monitor