On Thu, Mar 04, 2021 at 01:09:55PM +0100, Michael Ströder wrote:
> On 3/4/21 12:20 PM, Ondřej Kuzník wrote:
>> If it takes 1 second to replicate a change and a previous change
>> happened x seconds before this one there's going to be a window of 1
>> second where you see an x second CSN difference between the provider and
>> consumer. In no way does it mean the consumer is x seconds behind.
> I'm talking about the contextCSN difference being visible for several
> *hours* while the changes have been already successfully replicated.
> Replication delay is very short, syncrepl type is refreshAndPersist.
Don't think I've ever seen this outside slapcat (only checkpoints affect
the on-disk version). Please submit a bug if you can replicate this.
>> If there's an acceptable delay of n seconds, you better wait that
>> amount of time before raising an alarm,
> And what's an appropriate value for n? 86400? ;-]
Depends where in the galaxy you place your replicas :)
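
A minimal sketch of the "wait n seconds before raising an alarm" idea,
assuming you already poll the contextCSN values from the provider and the
consumer yourself. The names by_sid and GraceMonitor are made up for this
illustration and this is not the actual syncmonitor logic. It never uses
the timestamp difference between two CSNs as a lag measure; it only checks
whether a CSN seen on the provider is still missing from the consumer once
the grace period has passed.

```python
import time


def by_sid(csns):
    """Map serverID -> newest CSN string seen for that serverID.

    CSNs look like '20210304120955.000000Z#000000#001#000000'; the fields
    are fixed width, so plain string comparison orders them correctly.
    """
    newest = {}
    for csn in csns:
        sid = csn.split("#")[2]
        if csn > newest.get(sid, ""):
            newest[sid] = csn
    return newest


class GraceMonitor:
    """Hypothetical helper: report a serverID only after a provider CSN
    has been missing from the consumer for at least grace_seconds."""

    def __init__(self, grace_seconds):
        self.grace = grace_seconds
        self.pending = {}  # sid -> (target CSN, time it was first seen missing)

    def observe(self, provider_csns, consumer_csns, now=None):
        now = time.monotonic() if now is None else now
        provider = by_sid(provider_csns)
        consumer = by_sid(consumer_csns)
        overdue = []
        for sid, target in provider.items():
            have = consumer.get(sid, "")
            # Forget a remembered target once the consumer has reached it,
            # even if the provider has moved on in the meantime.
            if sid in self.pending and have >= self.pending[sid][0]:
                del self.pending[sid]
            if have >= target:
                continue
            # Remember the first target the consumer was seen missing.
            _wanted, since = self.pending.setdefault(sid, (target, now))
            if now - since >= self.grace:
                overdue.append(sid)
        return overdue
```

Call observe() on every poll with the contextCSN values read from each
side. A non-empty result lists the serverIDs whose changes are overdue.
The value of n (grace_seconds here) remains a deployment decision.
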
>> See the logic in syncmonitor
> Ideally I'd like to query cn=monitor whether slapd thinks replication is
> in a healthy state.
Consumer will never think its replication is slow/broken (unless it gets
an actual error and you can already see that in cn=monitor). Provider
might want to expose some information but that's not implemented yet and
will not be able to spot many issues if other providers exist.
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP