https://bugs.openldap.org/show_bug.cgi?id=10274
--- Comment #9 from falgon.comp@gmail.com --- Hello, No problem same for us, thank's for your reply,
For your recommendation, we haven't implemented custom code to count the number of iterations of slap_graduate_commit_csn. Instead, we've made a script that analyzes all replication logs and the verification of each CSN after our test. We have find interesting things, Here are a few observations for the test we have share before :
If we Count the occurence of CSN for all servers we have : 176363 20241104132232.937968Z#000000#002#000000 143023 20241104132232.374890Z#000000#001#000000 69302 20241104130859.571000Z#000000#003#000000 69294 20241104130849.482824Z#000000#004#000000 23087 20240919072405.196064Z#000000#001#000000 11409 20241104121608.755579Z#000000#001#000000 7960 20241104121927.229914Z#000000#002#000000 921 20241104133528.749142Z#000000#001#000000 483 20241104133656.538323Z#000000#001#000000 ...
If we look up to 20241104130859.571000Z#000000#003#000000
Number of iterations by server server1.log:678 server2.log:46176 server3.log:11291 server4.log:11157
And here are the first CSN logs for each server
server1 : 2024-11-04T14:24:10.085660+01:00 server1 slapd-2.5-user[2432501]: conn=1064 op=1 syncprov_op_search: got a persistent search with a cookie=rid=001,sid=003,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000 2024-11-04T14:24:10.159051+01:00 server1 slapd-2.5-user[2432501]: conn=1066 op=1 syncprov_op_search: got a persistent search with a cookie=rid=001,sid=004,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000 2024-11-04T14:24:49.792914+01:00 server1 slapd-2.5-user[2432501]: conn=-1 op=0 syncprov_add_slog: adding csn=20241104130859.571000Z#000000#003#000000 to sessionlog, uuid=e800557e-2a1b-4f5c-8ff5-273fef541eb0 2024-11-04T14:32:14.833713+01:00 server1 slapd-2.5-user[2432501]: do_syncrep2: rid=004 cookie=rid=004,sid=004,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000 2024-11-04T14:34:15.116600+01:00 server1 slapd-2.5-user[2432501]: do_syncrep1: rid=002 starting refresh (sending cookie=rid=002,sid=001,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000) 2024-11-04T14:34:15.180789+01:00 server1 slapd-2.5-user[2432501]: do_syncrep1: rid=002 starting refresh (sending cookie=rid=002,sid=001,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000)
server2 : 2024-11-04T14:25:30.018539+01:00 server2 slapd-2.5-user[1777246]: conn=1046 op=1 syncprov_op_search: got a persistent search with a cookie=rid=002,sid=003,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000 2024-11-04T14:25:30.026018+01:00 server2 slapd-2.5-user[1777246]: conn=1047 op=1 syncprov_op_search: got a persistent search with a cookie=rid=002,sid=004,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000 2024-11-04T14:25:30.037401+01:00 server2 slapd-2.5-user[1777246]: conn=1048 op=1 syncprov_op_search: got a search with a cookie=rid=002,sid=003,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000 2024-11-04T14:25:30.041279+01:00 server2 slapd-2.5-user[1777246]: conn=1048 op=2 syncprov_op_search: got a persistent search with a cookie=rid=002,sid=003,csn=20241104132232.374890Z#000000#001#000000;20241104132232.937968Z#000000#002#000000;20241104130859.571000Z#000000#003#000000;20241104130849.482824Z#000000#004#000000
server3 : 2024-11-04T14:08:59.571006+01:00 server3 slapd-2.5-user[3405557]: slap_get_csn: conn=8684 op=0 generated new csn=20241104130859.571000Z#000000#003#000000 manage=1 2024-11-04T14:08:59.571012+01:00 server3 slapd-2.5-user[3405557]: slap_queue_csn: queueing 0x56290f862840 20241104130859.571000Z#000000#003#000000 2024-11-04T14:08:59.571930+01:00 server3 slapd-2.5-user[3405557]: conn=8684 op=0 syncprov_add_slog: adding csn=20241104130859.571000Z#000000#003#000000 to sessionlog, uuid=e800557e-2a1b-4f5c-8ff5-273fef541eb0 2024-11-04T14:08:59.571989+01:00 server3 slapd-2.5-user[3405557]: slap_queue_csn: queueing 0x5629040c2540 20241104130859.571000Z#000000#003#000000 2024-11-04T14:08:59.664457+01:00 server3 slapd-2.5-user[3405557]: conn=1073 op=1 syncprov_qresp: set up a new syncres mode=1 csn=20241104130859.571000Z#000000#003#000000 2024-11-04T14:08:59.664465+01:00 server3 slapd-2.5-user[3405557]: conn=1066 op=1 syncprov_qresp: set up a new syncres mode=1 csn=20241104130859.571000Z#000000#003#000000
server4 : 2024-11-04T14:08:59.665159+01:00 server4 slapd-2.5-user[3412537]: do_syncrep2: rid=003 cookie=rid=003,sid=003,csn=20241104130859.571000Z#000000#003#000000 2024-11-04T14:08:59.665242+01:00 server4 slapd-2.5-user[3412537]: slap_queue_csn: queueing 0x55e30b476000 20241104130859.571000Z#000000#003#000000 2024-11-04T14:08:59.666221+01:00 server4 slapd-2.5-user[3412537]: conn=-1 op=0 syncprov_add_slog: adding csn=20241104130859.571000Z#000000#003#000000 to sessionlog, uuid=e800557e-2a1b-4f5c-8ff5-273fef541eb0 2024-11-04T14:08:59.666272+01:00 server4 slapd-2.5-user[3412537]: slap_queue_csn: queueing 0x55e30af67f80 20241104130859.571000Z#000000#003#000000 2024-11-04T14:08:59.670261+01:00 server4 slapd-2.5-user[3412537]: conn=1044 op=1 syncprov_qresp: set up a new syncres mode=1 csn=20241104130859.571000Z#000000#003#000000
We can see for the server1 and server2 the firsts occurence of the CSN are after the restart of instances, when for server3 and server4 start at 14:08. Can this kind of behavior be used to determine whether there is a replication problem ? If you need more details, you just need to ask, if you want the full details I can add them in the google drive.
PS : We will also be able to start our tests with version 2.6 soon, we will keep you informed about the improvements.