Hello,
I'm writing again in this thread to share an update and bring the topic back up.
Since the last message we have implemented the lastbind overlay with a lastbindprecision setting of 1800, which has greatly reduced the replication problem (but has not solved it completely). Over the past month, we've set up and configured other OpenLDAP directories with the same configuration for a new service. These new directories exhibit the same replication problem. The difference is that, for these new directories, we have far more modifications via MOD operations. The lastbind overlay is therefore no longer a solution to this problem.
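For reference, our lastbind setup looks roughly like this (a sketch of the contrib overlay's cn=config entry; the database DN is a placeholder for illustration):

```ldif
# Sketch: lastbind overlay attached to the main database.
# olcDatabase={1}mdb is an assumption; adjust to your database index.
dn: olcOverlay=lastbind,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcLastBindConfig
olcOverlay: lastbind
# Only update authTimestamp if the previous value is older than 1800 s,
# to limit the write (and replication) load caused by binds.
olcLastBindPrecision: 1800
```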
Have you had any reports of replication problems similar to ours over the year? Have you been able to investigate our case? We need help to find a solution or workaround for our new directories.
If you need any further information, please don't hesitate to ask me questions or read the previous posts in this conversation, including Ziani Meheni's initial message.
Thanks in advance
On Tue, Dec 5, 2023 at 3:29 PM, falgon.comp@gmail.com wrote:
Hello, thanks for the answer.
Quanah Gibson-Mount wrote:
--On Thursday, November 23, 2023 5:33 PM +0000 falgon.comp@gmail.com wrote:
b) olcLogLevel: stats sync
- We're running our tests with stats only. Meheni probably left this configuration in place to check before sending the config here.
My point was more that it should be a multi-valued attribute with unique values, not a single-valued attribute with two strings in the value.
Good to know, thanks.
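To make sure we apply this correctly on our side, here is a sketch of the ldapmodify LDIF we would use, with the two log levels as separate values (assuming write access to cn=config):

```ldif
# Two separate olcLogLevel values, as recommended,
# rather than a single value "stats sync".
dn: cn=config
changetype: modify
replace: olcLogLevel
olcLogLevel: stats
olcLogLevel: sync
```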
h) For your benchmark test, this is probably not frequent enough, as the purge will never run since you're saying only data
- We've run endurance tests to include purging. These settings are from a month ago and we have changed them multiple times to test different setups. To include the purge during tests, we actually set it to 00+01:00 00+00:03. In the final configuration we will probably set it to 03+00:00 00+00:03. We found that purging every 3 minutes reduced the impact on performance.
While it's correct that frequent purging is better, you missed my overall point, which is that when you're testing you likely want to purge data on a shorter timescale when doing a benchmark.
Yes this is what we did
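For clarity, the values above refer to the accesslog overlay's purge setting (age, then check interval, each as ddd+hh:mm). A sketch of the final configuration we plan, with placeholder DNs:

```ldif
# Sketch: keep 3 days of accesslog entries, check for expired
# entries every 3 minutes. The overlay/database DNs are assumptions;
# adjust to your cn=config layout.
dn: olcOverlay={0}accesslog,olcDatabase={1}mdb,cn=config
changetype: modify
replace: olcAccessLogPurge
olcAccessLogPurge: 03+00:00 00+00:03
```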
I'll repost a previous question here too: what are the exact messages or error messages we should see in case of a collision problem?
*If* there are collisions, you'll see the server falling back to REFRESH mode. But only if you have "sync" logging enabled.
syncrepl.c: Debug( LDAP_DEBUG_SYNC, "do_syncrep2: %s delta-sync lost sync on (%s), switching to REFRESH\n",
syncrepl.c: Debug( LDAP_DEBUG_SYNC, "do_syncrep2: %s delta-sync lost sync, switching to REFRESH\n",
--Quanah
We've done a lot of tests to reproduce the synchronization problem with sync logging enabled, and our OpenLDAP servers never switch to REFRESH mode. As we said, our OpenLDAP instances become very slow (as if they were freezing) during replication and can no longer replicate correctly, resulting in a constant accumulation of delay when the problem occurs. That's really strange. We've done a lot of testing, and we have the following scenarios that we don't understand:
- 200,000 users in random mode -> problem occurs
- 200,000 users in sequential mode -> problem occurs
- 1 user in random mode -> no problem
- 10 users in random mode -> no problem
We still have no explanation for these results.
Again, thanks for your time and your help.