On Tue, Feb 28, 2023 at 01:42:20PM +0100, Geert Hendrickx wrote:
We've had (and still have) this issue with large attributes and large multi-valued attributes with Zimbra (see previous discussion with Quanah), where we applied sortvals and multival. But in this scenario it's not the case; all objects are of similar small size, with (mostly) single valued attributes. Yet our freelist reaches 200K+ free pages during periods with heavy updates (mostly deletes/adds), which has a measurable impact on write performance.
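For reference, the mitigations mentioned above are plain slapd.conf directives; a minimal sketch, where the attribute name and the multival thresholds are only examples:

```conf
# slapd.conf fragment (illustrative)
# sortvals keeps large multi-valued attributes sorted, making
# individual value adds/deletes cheaper
sortvals        member

# back-mdb only: store an attribute's values in a separate sub-database
# once they exceed the given thresholds (numbers are illustrative)
multival        member 100,10
```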
Hi Geert, are you sure it's the freelist and not the random access as pages become non-contiguous? The former would mean a constant decline in performance, whereas the latter would eventually taper from high (best-case) performance down to the regular performance you should be able to expect. Have you been able to rule that out?
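One way to look at this is with mdb_stat from the LMDB tools, which reports the freelist separately from the rest of the environment; a sketch, with the database path as an example:

```shell
# Print environment info (-e) and freelist status (-f) for the main DB;
# repeating -f (i.e. -ff) additionally lists the free page ranges,
# which shows how fragmented the free space actually is.
mdb_stat -ef /var/lib/ldap
```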
For batch migrations we recently tried combining multiple updates into LDAP transactions, which is significantly faster on a clean db, but makes the freelist performance impact *worse* once the freelist is large enough. Could it be that transactions need larger contiguous runs of free pages, which forces a walk through the entire freelist to find them?
Someone else needs to comment on that.
It should be safe to include the accesslog *if* the server was shut down cleanly and everything was flushed into both databases.
Should nightly backups include the accesslog as well then? (implying we can no longer make simple mdb_copy backups while slapd is running... Or is it good enough to dump the accesslog *after* the main db, so it includes the relevant AND newer accesslog data?)
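For what it's worth, mdb_copy can still run against a live environment (it operates inside a read transaction), and its -c option writes a compacted copy that omits the freelist entirely; paths here are examples:

```shell
# Hot backup of the main DB and the accesslog DB;
# -c compacts the copy, dropping free pages in the process.
mdb_copy -c /var/lib/ldap     /backup/ldap
mdb_copy -c /var/lib/ldap-log /backup/ldap-log
```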
Disaster recovery does not need accesslog unless you need it for auditing purposes, but given you are happy to wipe it, I don't think that's the case. What you're doing here is not disaster recovery and you can't do this online.
Do you configure persistent or in-memory sessionlog?
in memory
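For context, the in-memory sessionlog is configured on the syncprov overlay; a minimal slapd.conf sketch, with the size only as an example:

```conf
overlay syncprov
# keep up to 10000 operations in the in-memory sessionlog (example size);
# being in-memory, it is lost on restart, so a restarted provider cannot
# use it to replay recent changes to consumers
syncprov-sessionlog 10000
```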
After you kill accesslog, you disable deltasync. Since you're also restarting, the provider has no data on how to replay anything and needs to send the list of all entries (at least their UUIDs). This is expensive and slow. Replication seems to proceed in slow leaps that cost a *lot* of processing on the provider and a fair amount of bandwidth. Isn't that what you're seeing?
Are your accesslog entries so large that they don't fit a page? If not, just let the freelist be reused for the next time you have a large batch of updates again. That's what it's there for. And even then, accesslog in particular shouldn't really suffer from fragmentation as much as the main DB would.
Ok. We're seeing 1M+ free pages in the accesslog after large batch jobs and the subsequent logpurge. It could be completely innocent like you say; we just clean it up as a precaution because of the proven main-db performance issue mentioned above. We'll hold off on this for now and see.
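For reference, the purge schedule is set on the accesslog overlay itself; a sketch, with the age and interval values only as examples:

```conf
overlay accesslog
logdb cn=accesslog
# purge entries older than 7 days, checking once per day
# (dd+hh:mm format; values are illustrative)
logpurge 07+00:00 01+00:00
```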
As I always suggest, when you have a hypothesis, see if you can test it before you implement something in production. If you can also report here how it went, we can help you confirm/form a better one. In this case, I think performance settles and you can adjust resources to match your requirements.
Yeah, in δ-multiprovider both main DB and accesslog (and their contextCSNs) are used together and should be monitored as such.
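A simple way to watch both contextCSNs side by side is a pair of base-scope searches; the URI and base DNs here are examples:

```shell
# Compare the contextCSN of the main suffix and of the accesslog DB;
# in delta-multiprovider both should be monitored together.
ldapsearch -x -H ldap://provider.example.com -s base \
    -b "dc=example,dc=com" contextCSN
ldapsearch -x -H ldap://provider.example.com -s base \
    -b "cn=accesslog" contextCSN
```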
Thanks for your advice, will revise the monitoring.
Regards,