Re: Sync replication failure during startup.

1 Oct 2007


On Fri, 2007-09-28 at 17:02 -0700, Howard Chu wrote:
...
> Stelios Grigoriadis wrote:
...
>> I have upgraded OpenLDAP to the latest stable version (2.3.38) and
>> used Berkeley DB version 4.5.20. The problem remains. I realize
>> my analysis wasn't correct since, as Howard Chu pointed out, the
>> CSN contains both a timestamp and a counter, so the entryCSNs
>> are unique.
>> But the problem remains and I have no idea why this happens.
>> I still suspect that the problem is in the initial
>> phase of the sync operation (refresh stage). It might be that
>> some of the not-yet-committed modifications don't make it into
>> the result set of the search operation. Later, after another entry
>> is added, the "lost" entries are never synced over.
> This also cannot be the cause. The contextCSN is snapshotted at the beginning of a
> refresh. Only updates between the consumer's cookie CSN and the snapshot CSN are
> sent to the consumer. Any entries added during this refresh will be excluded from
> the update, and the consumer will then record the snapshot CSN. Any entries the
> consumer didn't pick up in this refresh pass will be picked up in the next refresh.

I agree with you, I just didn't see the "next refresh" in the code. I
thought it refreshed only once and then the master would write back all
subsequent changes (syncprov_op_response -> syncprov_qstart etc.)
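
For readers following the thread, the refresh window Howard describes can be
sketched as a toy model. This is not the syncprov code: the Provider class,
the refresh() function, the concurrent_adds hook, and the use of plain
integers in place of real CSNs (which carry a timestamp and a counter) are
all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Toy provider: maps each DN to the CSN of its last change."""
    entries: dict = field(default_factory=dict)
    context_csn: int = 0  # assumption: an integer stands in for a real CSN

    def add(self, dn):
        # Every write gets a new, strictly larger CSN.
        self.context_csn += 1
        self.entries[dn] = self.context_csn


def refresh(provider, cookie_csn, concurrent_adds=()):
    """Return (entries sent, new consumer cookie) for one refresh pass."""
    # The contextCSN is snapshotted at the beginning of the refresh.
    snapshot = provider.context_csn
    # Hypothetical hook: writes that land while the refresh is running.
    for dn in concurrent_adds:
        provider.add(dn)
    # Only changes with cookie < CSN <= snapshot are sent.
    sent = sorted(dn for dn, csn in provider.entries.items()
                  if cookie_csn < csn <= snapshot)
    # The consumer records the snapshot CSN as its new cookie.
    return sent, snapshot


provider = Provider()
provider.add("cn=a")
provider.add("cn=b")
first, cookie = refresh(provider, 0, concurrent_adds=["cn=c"])
second, cookie = refresh(provider, cookie)
print(first)   # ['cn=a', 'cn=b']
print(second)  # ['cn=c']
```

In this model the entry added during the first pass is left out of that pass
and is delivered by the next one, which is the behaviour described above. It
says nothing about how persist-mode updates are queued after the refresh.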
...
...
>> I will test some more and try to provide more information. I have
>> a test program that generates this problem but it is a little
>> cumbersome. I will try to slim it down and use more common schema
>> elements before posting it.
> That will certainly help.

The setup to reproduce the error is as follows: 1 master, 3 replicas.
1. Start the replicas.
2. Start the program that adds persons (parallell_stress_simple.sh).
   Actually a script that starts a number of processes (add_person.c)
   on different machines that add persons.
3. Start the master.
4. When the script completes, compare the number of added entries in
   the master and replicas.
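
For step 4, a count alone does not show which entries went missing. A minimal
sketch of a per-replica diff follows, assuming the DNs have already been
collected from each server by some other means (for example a search of the
person entries). The function name and the host labels are hypothetical.

```python
def missing_entries(master_dns, replica_dns_by_host):
    """Map each replica to the sorted DNs the master has but it lacks."""
    master = set(master_dns)
    report = {}
    for host, dns in replica_dns_by_host.items():
        lost = master - set(dns)
        if lost:  # replicas that are fully in sync are omitted
            report[host] = sorted(lost)
    return report


report = missing_entries(
    ["cn=p1", "cn=p2", "cn=p3"],
    {"replica1": ["cn=p1", "cn=p2", "cn=p3"],
     "replica2": ["cn=p1", "cn=p3"]},
)
print(report)  # {'replica2': ['cn=p2']}
```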
To Quanah Gibson-Mount: The slapd.conf is also provided.
/Stelios
