Re: Content Sync Refresh Required

28 Feb 2023


      On Mon, Feb 27, 2023 at 08:12:44PM +0100, Geert Hendrickx wrote:
...
On Mon, Feb 27, 2023 at 19:18:38 +0100, Ondřej Kuzník wrote:
...
Hi Geert,
you didn't answer the question of whether you also monitor the accesslog's
contextCSN. In deltasync, the combination of both is important.
Ok, we don't.  I'll take a look next time things are drifting.
In a stable environment, the accesslog's contextCSN is identical to the
main db's contextCSN, for every SID.
Hi Geert,
yes, accesslog's contextCSN should always be in sync with its main DB.
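For illustration, a per-SID comparison of the two databases' contextCSN
values could look roughly like the sketch below. It assumes you have already
read the contextCSN values from the main DB suffix and from the accesslog
suffix (e.g. with ldapsearch) into two lists of strings, and that they use
the usual timestamp#count#sid#mod layout with a hex SID. The function names
by_sid and csn_drift are made up for this example and are not part of any
OpenLDAP tool.

```python
def by_sid(csns):
    """Map SID -> newest CSN string among one database's contextCSN values."""
    newest = {}
    for csn in csns:
        parts = csn.split("#")
        if len(parts) != 4:
            raise ValueError(f"not a CSN: {csn!r}")
        sid = int(parts[2], 16)  # assumption: SID is the third field, in hex
        # CSNs have fixed-width fields, so string comparison orders them in time.
        if sid not in newest or csn > newest[sid]:
            newest[sid] = csn
    return newest


def csn_drift(main_csns, accesslog_csns):
    """Return {sid: (main_csn, accesslog_csn)} for every SID where they differ.

    A SID that is present in only one of the two databases is reported with
    None on the other side.
    """
    main, log = by_sid(main_csns), by_sid(accesslog_csns)
    return {
        sid: (main.get(sid), log.get(sid))
        for sid in sorted(main.keys() | log.keys())
        if main.get(sid) != log.get(sid)
    }
```

An empty result means both databases agree for every SID, which is the
stable state described above. Anything else is worth looking at before
touching either database.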
...
...
My emphasis on "incremental". Usually when contextCSN and cookie are
found to be incompatible (missing sids from cookie, or even tighter
constraints when configured with syncprov-sessionlog-source), it tells
the consumer to step down from deltasync or start without a cookie.
Ok, I assumed the consumer decides on its own if it's in sync with a given
provider, by comparing its contextCSN to the provider's, and only if it's
NOT in sync, query the provider's accesslog for delta sync from the CSN
it's currently at, if possible.
Except for initial sync (no data in consumer), the consumer always tries
deltasync first. Provider then proceeds accordingly or tells the
consumer to fall back to plain syncrepl (the most common reason to see
e-syncRefreshRequired - 4096).
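As a toy model of just the "missing sids from cookie" condition mentioned
earlier (this is not syncprov's actual logic, which checks more than this),
one can list the SIDs that the provider's contextCSN knows about but the
consumer's cookie does not. Both inputs are assumed to be plain lists of CSN
strings in the timestamp#count#sid#mod layout; the function names are
invented for this sketch.

```python
def sid_of(csn):
    """Extract the SID (third '#'-separated field, hex) from a CSN string."""
    parts = csn.split("#")
    if len(parts) != 4:
        raise ValueError(f"not a CSN: {csn!r}")
    return int(parts[2], 16)


def sids_missing_from_cookie(cookie_csns, provider_csns):
    """Sorted list of SIDs seen in the provider's contextCSN but not in the cookie.

    In this simplified model, a non-empty result is one reason a provider
    might ask the consumer to step down from deltasync.
    """
    provider_sids = {sid_of(c) for c in provider_csns}
    cookie_sids = {sid_of(c) for c in cookie_csns}
    return sorted(provider_sids - cookie_sids)
```

If this returns a non-empty list for a consumer that keeps logging refresh
required, that points at the cookie rather than at the accesslog contents.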
...
...
...
At least for the main db, it makes a significant performance difference if
the accesslog gets too large.  Therefore we mdb_copy -c the database from
time to time.  We do this on one server, then distribute this mdb to other
servers and drop their accesslog, since it doesn't match the (imported)
main db anymore.  But then other replicas start logging "Content Sync
Refresh Required" for the corresponding rid, even if no updates are coming
in through *that* server, so its contextCSN is static.
You mean accesslog DB or accesslog freelist? Also now you're saying
you're obliterating the whole accesslog (and compacting the mainDB),
where previously you said you were compacting accesslog.
We are compacting (mdb_copy -c) the main db on one server, AND throwing
away the accesslog on other servers where we import this mdb, because it
then no longer matches the local accesslog.
Context at https://openldap.org/lists/openldap-technical/201708/msg00049.html
(although this was about a different LDAP database than the one we're
currently talking about.)
Unless your entries are larger than pagesize *and* you have massive
churn on those, you don't want to do this. Are you confident that's the
case? What is your number of overflow pages? What kind of entries is
it down to? If it's entries with a large number of values in an attribute
(e.g. groups), you might also want to look into sortvals (see man 5
slapd.conf) and multival (man 5 slapd-mdb) to store them more
efficiently.
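To put a number on the overflow page question, something like the sketch
below can summarise saved mdb_stat output. It assumes the text contains the
usual "Branch pages:", "Leaf pages:" and "Overflow pages:" lines and simply
sums them across whatever sections are present; the function name
overflow_share is made up for this example.

```python
import re

# Matches lines such as "  Overflow pages: 12" in mdb_stat output.
PAGE_LINE = re.compile(r"^\s*(Branch|Leaf|Overflow) pages:\s*(\d+)\s*$", re.M)


def overflow_share(mdb_stat_text):
    """Return (overflow_pages, overflow_pages / all_counted_pages).

    Counts are summed over every section found in the text, so pass the
    output for just the database(s) you care about.
    """
    counts = {"Branch": 0, "Leaf": 0, "Overflow": 0}
    for kind, n in PAGE_LINE.findall(mdb_stat_text):
        counts[kind] += int(n)
    total = sum(counts.values())
    share = counts["Overflow"] / total if total else 0.0
    return counts["Overflow"], share
```

A small share suggests most entries fit in a page, in which case compacting
is unlikely to buy much.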
...
This always went fine, but now turns out to confuse other consumers in an
MMR environment.  Should we instead run mdb_copy -c locally on each server?
(this can be a pretty slow operation)  Or is there another "clean" way to
copy mdb databases between replicas?  Include the corresponding accesslog?
It should be safe to include the accesslog *if* the server was shut down
cleanly and everything was flushed into both.
Do you configure persistent or in-memory sessionlog?
...
The other scenario was that after large batch updates, when the accesslog
has grown much bigger than usual (which is not a problem in itself), after
logpurge this leaves a large freelist in the accesslog as well.  So as a
precaution, we "clean up" here as well by just dropping that accesslog -
obviously at a quiet time and when all servers are in sync.  This turned
out to be a mistake.
Are your accesslog entries so large that they don't fit a page? If not,
just let the freelist be reused for the next time you have a large batch
of updates again. That's what it's there for. And even then, accesslog
in particular shouldn't really suffer from fragmentation as much as the
main DB would.
...
...
If this is the case then yeah, you're removing every way the servers
could have performed an efficient resync after reconnect/restart and
that will take time and processing power to perform (probably running a
refresh present, which is only one step up from a total resync). This
makes little sense operationally.
Ok, so far we only looked at the contextCSN of the main DIT, assuming this
told the whole story.
Yeah, in δ-multiprovider both main DB and accesslog (and their
contextCSNs) are used together and should be monitored as such.
Regards,
-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
