OpenLDAP v. 2.3.32
Berkeley DB 4.6
gcc 4.1.0
Replication doesn't work if the master server is started after the replica servers and a large number of simultaneous updates are performed while the master is starting up.
The entries that didn't get replicated to the replicas will not be replicated even after a restart of both master and replicas. The contextCSN is set to a value larger than the entryCSN of the "lost" entries.
This is what I think happens during a master server startup with simultaneous updates ongoing (and replicas trying to sync in the initial phase).
Suppose that two clients (Client1 and Client2) are adding the entries a and b respectively. If that happens between t1 and t2 (one second apart), they will get the same entryCSN (same timestamp). If entry a is committed at tc1 and b at tc2, any replica search in between will only get entry a. Entry b will be lost.
Client1 entry=a, csn=x
Client2 entry=b, csn=x
Timeline ------+----------+---------+----------+------>
               |          |         |          |
               t1         |         |       t2=t1+1
                          |         |
                   tc1=entry a   tc2=entry b
                    committed     committed
Replica search query between tc1 and tc2.
I don't know if a higher granularity would prevent this, or, even better, whether some kind of counter could be used so that every modification gets a unique CSN.
Can you please comment on our analysis and let us know whether it is correct or whether we have missed something important?
Any help or hints on how to avoid or fix this problem are greatly appreciated.
If I receive useful information directly in private email, I will post a summary.
Regards
Stelios Grigoriadis
Stelios Grigoriadis wrote:
OpenLDAP v. 2.3.32 Berkeley DB 4.6 gcc 4.1.0
Replication doesn't work if the master server is started after the replica servers and a large number of simultaneous updates are performed while the master is starting up.
The entries that didn't get replicated to the replicas will not be replicated even after a restart of both master and replicas. The contextCSN is set to a value larger than the entryCSN of the "lost" entries.
This is what I think happens during a master server startup with simultaneous updates ongoing (and replicas trying to sync in the initial phase).
Suppose that two clients (Client1 and Client2) are adding the entries a and b respectively. If that happens between t1 and t2 (one second apart), they will get the same entryCSN (same timestamp).
No, entryCSNs include both a counter and a timestamp; multiple updates within the same time quantum will still have unique CSNs.
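For illustration only (these CSN values are made up): two updates that land in the same one-second timestamp still get distinct CSNs through the counter field, and since the CSN is a fixed-width string, an ordinary string comparison already reflects their order.

#include <stdio.h>
#include <string.h>

int main( void )
{
	/* same second, different change counters */
	const char *csn_a = "20070927075600Z#000001#00#000000";
	const char *csn_b = "20070927075600Z#000002#00#000000";

	/* fixed-width format: string order == CSN order */
	printf( "distinct: %s, a before b: %s\n",
		strcmp( csn_a, csn_b ) != 0 ? "yes" : "no",
		strcmp( csn_a, csn_b ) <  0 ? "yes" : "no" );
	return 0;
}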
However, some bugs related to syncprov that may be relevant here were fixed in 2.3.34. You should upgrade to the current stable release, 2.3.38, and see if the issue remains.
--On Thursday, September 27, 2007 9:56 AM +0200 Stelios Grigoriadis stelios.xx.grigoriadis@ericsson.com wrote:
Any help or hints on how to avoid or fix this problem are greatly appreciated.
(a) OpenLDAP 2.3 does not support BDB 4.6. Use a supported version of BDB.
(b) Upgrade to the most recent release. If you had taken the time to go to the OpenLDAP website and read the changelog prior to filing your bug, you would have seen that there were fixes for syncrepl since OpenLDAP 2.3.32. If you can reproduce after upgrading and using a supported version of BDB, please follow up.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Stelios Grigoriadis wrote:
OpenLDAP v. 2.3.32 Berkeley DB 4.6
The officially released version of BerkeleyDB 4.6 doesn't work with OpenLDAP 2.3. Exactly which version of BerkeleyDB are you actually using? Saying "4.6" is about as useful as saying you're running on a "Linux 2.6 kernel" - without the full version number, you really haven't said anything useful.
I have upgraded OpenLDAP to the latest stable version (2.3.38) and used Berkeley DB version 4.5.20. The problem remains. I realize my analysis wasn't correct since, as Howard Chu pointed out, the CSN contains both a timestamp and a counter. So the entryCSNs are unique.
But the problem remains and I have no idea why this happens. I still suspect that the problem is in the initial phase of the sync operation (the refresh stage). It might be that some of the not-yet-committed modifications don't make it into the result set of the search operation. Later, after another entry is added, the "lost" entries are never synced over.
I will test some more and try to provide more information. I have a test program that generates this problem but it is a little cumbersome. I will try to slim it down and use more common schema elements before posting it.
Regards
/Stelios
--On September 28, 2007 3:20:18 PM +0200 Stelios Grigoriadis stelios.xx.grigoriadis@ericsson.com wrote:
I have upgraded OpenLDAP to the latest stable version (2.3.38) and used Berkeley DB version 4.5.20. The problem remains. I realize my analysis wasn't correct since, as Howard Chu pointed out, the CSN contains both a timestamp and a counter. So the entryCSNs are unique.
But the problem remains and I have no idea why this happens. I still suspect that the problem is in the initial phase of the sync operation (the refresh stage). It might be that some of the not-yet-committed modifications don't make it into the result set of the search operation. Later, after another entry is added, the "lost" entries are never synced over.
I will test some more and try to provide more information. I have a test program that generates this problem but it is a little cumbersome. I will try to slim it down and use more common schema elements before posting it.
Please provide the syncrepl configuration block from one of your replicas as well (minus any passwords).
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
I've had a similar problem here. I don't really have much to add in the way of info for tracking it down, but here is an additional question:
Once a replica server gets out of sync w/ the master in this way, how does one get the replica back in sync? I've not been able to figure that out, short of deleting the entire database on the replica and rebuilding it from scratch, which can take a very long time.
On Fri, Sep 28, 2007 at 03:20:18PM +0200, Stelios Grigoriadis wrote:
I have upgraded OpenLDAP to the latest stable version (2.3.38) and used Berkeley DB version 4.5.20. The problem remains. I realize my analysis wasn't correct since, as Howard Chu pointed out, the CSN contains both a timestamp and a counter. So the entryCSNs are unique.
But the problem remains and I have no idea why this happens. I still suspect that the problem is in the initial phase of the sync operation (the refresh stage). It might be that some of the not-yet-committed modifications don't make it into the result set of the search operation. Later, after another entry is added, the "lost" entries are never synced over.
I will test some more and try to provide more information. I have a test program that generates this problem but it is a little cumbersome. I will try to slim it down and use more common schema elements before posting it.
Regards
/Stelios
On Fri, 2007-09-28 at 13:44 -0700, William R. Somsky wrote:
I've had a similar problem here. I don't really have much to add in the way of info for tracking it down, but here is an additional question:
Once a replica server gets out of sync w/ the master in this way, how does one get the replica back in sync? I've not been able to figure that out, short of deleting the entire database on the replica and rebuilding it from scratch, which can take a very long time.
To my knowledge, you can't. Since the replica contextCSN is timestamped at a later time than the out-of-sync entries, they can't be synced.
/Stelios
Stelios Grigoriadis wrote:
On Fri, 2007-09-28 at 13:44 -0700, William R. Somsky wrote:
I've had a similar problem here. I don't really have much to add in the way of info for tracking it down, but here is an additional question:
Once a replica server gets out of sync w/ the master in this way, how does one get the replica back in sync? I've not been able to figure that out, short of deleting the entire database on the replica and rebuilding it from scratch, which can take a very long time.
IMO it's not worth answering this question until you know why things got out of sync in the first place.
To my knowledge, you can't. Since the replica contextCSN is timestamped at a later time than the out-of-sync entries, they can't be synced.
You obviously haven't read the slapd(8) manpage carefully enough. See the "-c" option.
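For example (see slapd(8) for the exact cookie syntax; the rid below must match the rid= in the consumer's syncrepl directive, and the CSN value is only illustrative), starting the consumer once with

   ./slapd -c rid=1,csn=20070901000000Z#000000#00#000000 ...

tells it to sync again from that older state instead of trusting the contextCSN it has stored.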
On Mon, 2007-10-01 at 07:14 -0700, Howard Chu wrote:
Stelios Grigoriadis wrote:
On Fri, 2007-09-28 at 13:44 -0700, William R. Somsky wrote:
I've had a similar problem here. I don't really have much to add in the way of info for tracking it down, but here is an additional question:
Once a replica server gets out of sync w/ the master in this way, how does one get the replica back in sync? I've not been able to figure that out, short of deleting the entire database on the replica and rebuilding it from scratch, which can take a very long time.
IMO it's not worth answering this question until you know why things got out of sync in the first place.
To my knowledge, you can't. Since the replica contextCSN is timestamped at a later time than the out-of-sync entries, they can't be synced.
You obviously haven't read the slapd(8) manpage carefully enough. See the "-c" option.
No I didn't, my bad. And it works like a charm.
Some additional info: it seems to me that it is easier to reproduce this problem with the current stable version than with the previous one (2.3.32).
/Stelios
Stelios Grigoriadis wrote:
I have upgraded OpenLDAP to the latest stable version (2.3.38) and used Berkeley DB version 4.5.20. The problem remains. I realize my analysis wasn't correct since, as Howard Chu pointed out, the CSN contains both a timestamp and a counter. So the entryCSNs are unique.
But the problem remains and I have no idea why this happens. I still suspect that the problem is in the initial phase of the sync operation (the refresh stage). It might be that some of the not-yet-committed modifications don't make it into the result set of the search operation. Later, after another entry is added, the "lost" entries are never synced over.
This also cannot be the cause. The contextCSN is snapshotted at the beginning of a refresh. Only updates between the consumer's cookie CSN and the snapshot CSN are sent to the consumer. Any entries added during this refresh will be excluded from the update, and the consumer will then record the snapshot CSN. Any entries the consumer didn't pick up in this refresh pass will be picked up in the next refresh.
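Just to make the window concrete, here is a tiny illustrative sketch (this is not the actual syncprov code; the CSN values are invented, and the real code compares struct bervals with ber_bvcmp rather than C strings):

#include <stdio.h>
#include <string.h>

/* During a refresh, send an entry only if its entryCSN falls in the
 * window (consumer cookie CSN, provider snapshot CSN]. */
static int
send_during_refresh( const char *entry_csn,
	const char *cookie_csn, const char *snapshot_csn )
{
	/* fixed-width CSN strings, so string order == CSN order */
	return strcmp( entry_csn, cookie_csn ) > 0 &&
	       strcmp( entry_csn, snapshot_csn ) <= 0;
}

int main( void )
{
	const char *cookie   = "20071001000000Z#000000#00#000000";
	const char *snapshot = "20071005135356Z#00000d#00#000000";

	/* committed before the snapshot: sent in this refresh */
	printf( "%d\n", send_during_refresh(
		"20071005135356Z#00000c#00#000000", cookie, snapshot ) );

	/* committed after the snapshot was taken: skipped for now,
	 * to be picked up by a later refresh */
	printf( "%d\n", send_during_refresh(
		"20071005135356Z#00000e#00#000000", cookie, snapshot ) );
	return 0;
}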
I will test some more and try to provide more information. I have a test program that generates this problem but it is a little cumbersome. I will try to slim it down and use more common schema elements before posting it.
That will certainly help.
On Fri, 2007-09-28 at 17:02 -0700, Howard Chu wrote:
Stelios Grigoriadis wrote:
I have upgraded OpenLDAP to the latest stable version (2.3.38) and used Berkeley DB version 4.5.20. The problem remains. I realize my analysis wasn't correct since, as Howard Chu pointed out, the CSN contains both a timestamp and a counter. So the entryCSNs are unique.
But the problem remains and I have no idea why this happens. I still suspect that the problem is in the initial phase of the sync operation (the refresh stage). It might be that some of the not-yet-committed modifications don't make it into the result set of the search operation. Later, after another entry is added, the "lost" entries are never synced over.
This also cannot be the cause. The contextCSN is snapshotted at the beginning of a refresh. Only updates between the consumer's cookie CSN and the snapshot CSN are sent to the consumer. Any entries added during this refresh will be excluded from the update, and the consumer will then record the snapshot CSN. Any entries the consumer didn't pick up in this refresh pass will be picked up in the next refresh.
I agree with you; I just didn't see the "next refresh" in the code. I thought it refreshed only once and that the master would then write back all subsequent changes (syncprov_op_response -> syncprov_qstart etc.).
I will test some more and try to provide more information. I have a test program that generates this problem but it is a little cumbersome. I will try to slim it down and use more common schema elements before posting it.
That will certainly help.
The setup to reproduce the error is as follows: 1 master, 3 replicas.
1. Start the replicas.
2. Start the program that adds persons (parallell_stress_simple.sh). This is actually a script that starts a number of processes (add_person.c) on different machines, each adding persons.
3. Start the master.
4. When the script completes, compare the number of added entries on the master and on the replicas.
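The add_person.c client itself is still a bit too cumbersome to post; roughly, each process does something like the following (illustrative sketch only, not the real program; the URI, credentials and naming are placeholders):

/* add_person.c -- illustrative sketch only, not the real test client.
 * Build: cc add_person.c -o add_person -lldap -llber
 * Usage: ./add_person <suffix>   (e.g. a per-host instance number)
 */
#define LDAP_DEPRECATED 1
#include <stdio.h>
#include <ldap.h>

int main( int argc, char **argv )
{
	LDAP *ld;
	int i, rc;
	char dn[256], cn[64];
	char *oc_vals[] = { "person", NULL };
	char *cn_vals[] = { cn, NULL };
	char *sn_vals[] = { "Person", NULL };
	LDAPMod oc_mod = { LDAP_MOD_ADD, "objectClass", { oc_vals } };
	LDAPMod cn_mod = { LDAP_MOD_ADD, "cn", { cn_vals } };
	LDAPMod sn_mod = { LDAP_MOD_ADD, "sn", { sn_vals } };
	LDAPMod *mods[] = { &oc_mod, &cn_mod, &sn_mod, NULL };

	if ( argc < 2 ) return 1;

	/* placeholder URI and credentials; error handling trimmed */
	ldap_initialize( &ld, "ldap://master.example.com" );
	ldap_simple_bind_s( ld, "cn=Manager,dc=example,dc=com", "secret" );

	/* add a batch of person entries as fast as possible */
	for ( i = 0; i < 1000; i++ ) {
		snprintf( cn, sizeof(cn), "dlt%d-%s", i, argv[1] );
		snprintf( dn, sizeof(dn), "cn=%s,dc=example,dc=com", cn );
		rc = ldap_add_ext_s( ld, dn, mods, NULL, NULL );
		if ( rc != LDAP_SUCCESS )
			fprintf( stderr, "add %s: %s\n", dn, ldap_err2string( rc ) );
	}
	ldap_unbind_ext_s( ld, NULL, NULL );
	return 0;
}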
To Quanah Gibson-Mount: I have also provided the slapd.conf.
/Stelios
That will certainly help.
The setup to reproduce the error is as follows: 1 master, 3 replicas.
- Start the replicas.
- Start the program that adds persons (parallell_stress_simple.sh). This is actually a script that starts a number of processes (add_person.c) on different machines, each adding persons.
- Start the master.
- When the script completes, compare the number of added entries in the master and replicas.
To Quanah Gibson-Mount: I have also provided the slapd.conf.
Thanks for the info, I'll see if I can reproduce it.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Stelios Grigoriadis wrote:
The setup to reproduce the error is as follows: 1 master, 3 replicas.
I've been traveling and haven't yet been able to run this test. In the meantime, can you run slapd with debug level sync, and upload those logs somewhere for this test?
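(Debug level sync is 16384, i.e. ./slapd -d 16384.)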
- Start the replicas.
- Start the program that adds persons (parallell_stress_simple.sh). This is actually a script that starts a number of processes (add_person.c) on different machines, each adding persons.
- Start the master.
- When the script completes, compare the number of added entries in the master and replicas.
To Quanah Gibson-Mount: I have also provided the slapd.conf.
/Stelios
Howard Chu wrote:
Stelios Grigoriadis wrote:
The setup to reproduce the error is as follows: 1 master, 3 replicas.
I've been traveling and haven't yet been able to run this test. In the meantime, can you run slapd with debug level sync, and upload those logs somewhere for this test?
- Start the replicas.
- Start the program that adds persons (parallell_stress_simple.sh). This is actually a script that starts a number of processes (add_person.c) on different machines, each adding persons.
- Start the master.
- When the script completes, compare the number of added entries in the master and replicas.
To Quanah Gibson-Mount: I have also provided the slapd.conf.
/Stelios
The output from one run. I haven't had the time to look at it yet.
ldapt1:~/openldap-2.3.38/servers/slapd # ./slapd -d 16384 @(#) $OpenLDAP: slapd 2.3.38 (Oct 5 2007 14:08:48) $ root@ldapt1:/root/openldap-2.3.38/servers/slapd /etc/openldap/slapd.conf: line 101: rootdn is always granted unlimited privileges. /etc/openldap/slapd.conf: line 135: rootdn is always granted unlimited privileges. slapd starting => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6107-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6172-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6139-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6123-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6080-0,dc=example,dc=com) put failed: -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: subtree (cn=dlt30715-1,dc=example,dc=com) put failed: -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6172-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6080-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6107-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6123-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve 
a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: subtree (cn=dlt6139-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: subtree (cn=dlt6123-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6107-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6080-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: subtree (cn=dlt6139-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: subtree (cn=dlt6123-0,dc=example,dc=com) put failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) => bdb_dn2id_add: parent (dc=example,dc=com) insert failed: -30995 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21127-0,dc=example,dc=com CSN 20071005135356Z#00001d#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6107-0,dc=example,dc=com CSN 20071005135356Z#00001f#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6172-0,dc=example,dc=com CSN 20071005135356Z#000020#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30678-0,dc=example,dc=com CSN 20071005135356Z#000021#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6080-0,dc=example,dc=com CSN 20071005135356Z#000024#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21126-0,dc=example,dc=com CSN 20071005135356Z#000025#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21125-0,dc=example,dc=com CSN 20071005135356Z#000026#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9984-0,dc=example,dc=com CSN 20071005135356Z#000027#00#000000 
greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30715-1,dc=example,dc=com CSN 20071005135356Z#000028#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30604-1,dc=example,dc=com CSN 20071005135357Z#000000#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9934-0,dc=example,dc=com CSN 20071005135357Z#000001#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9952-1,dc=example,dc=com CSN 20071005135357Z#000002#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21127-1,dc=example,dc=com CSN 20071005135357Z#000003#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9984-1,dc=example,dc=com CSN 20071005135357Z#000004#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30678-1,dc=example,dc=com CSN 20071005135357Z#000005#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9923-1,dc=example,dc=com CSN 20071005135357Z#000006#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21125-1,dc=example,dc=com CSN 20071005135358Z#000000#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6172-1,dc=example,dc=com CSN 20071005135358Z#000001#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30616-1,dc=example,dc=com CSN 20071005135358Z#000002#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6107-1,dc=example,dc=com CSN 20071005135358Z#000003#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9934-1,dc=example,dc=com CSN 20071005135358Z#000004#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21127-0,dc=example,dc=com CSN 20071005135356Z#00001d#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6107-0,dc=example,dc=com CSN 20071005135356Z#00001f#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6172-0,dc=example,dc=com CSN 20071005135356Z#000020#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30678-0,dc=example,dc=com CSN 20071005135356Z#000021#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6080-0,dc=example,dc=com CSN 20071005135356Z#000024#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21126-0,dc=example,dc=com CSN 20071005135356Z#000025#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21125-0,dc=example,dc=com CSN 20071005135356Z#000026#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9984-0,dc=example,dc=com CSN 20071005135356Z#000027#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30715-1,dc=example,dc=com CSN 20071005135356Z#000028#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30604-1,dc=example,dc=com CSN 20071005135357Z#000000#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9934-0,dc=example,dc=com CSN 20071005135357Z#000001#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9952-1,dc=example,dc=com CSN 20071005135357Z#000002#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21127-1,dc=example,dc=com CSN 20071005135357Z#000003#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 
Entry cn=dlt9984-1,dc=example,dc=com CSN 20071005135357Z#000004#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30678-1,dc=example,dc=com CSN 20071005135357Z#000005#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9923-1,dc=example,dc=com CSN 20071005135357Z#000006#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21125-1,dc=example,dc=com CSN 20071005135358Z#000000#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6172-1,dc=example,dc=com CSN 20071005135358Z#000001#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30616-1,dc=example,dc=com CSN 20071005135358Z#000002#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6107-1,dc=example,dc=com CSN 20071005135358Z#000003#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9934-1,dc=example,dc=com CSN 20071005135358Z#000004#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 => bdb_dn2id_add: put failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock -30995 Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21127-0,dc=example,dc=com CSN 20071005135356Z#00001d#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6107-0,dc=example,dc=com CSN 20071005135356Z#00001f#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6172-0,dc=example,dc=com CSN 20071005135356Z#000020#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30678-0,dc=example,dc=com CSN 20071005135356Z#000021#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6080-0,dc=example,dc=com CSN 20071005135356Z#000024#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21126-0,dc=example,dc=com CSN 20071005135356Z#000025#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21125-0,dc=example,dc=com CSN 20071005135356Z#000026#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9984-0,dc=example,dc=com CSN 20071005135356Z#000027#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30715-1,dc=example,dc=com CSN 20071005135356Z#000028#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30604-1,dc=example,dc=com CSN 20071005135357Z#000000#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9934-0,dc=example,dc=com CSN 20071005135357Z#000001#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9952-1,dc=example,dc=com CSN 20071005135357Z#000002#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21127-1,dc=example,dc=com CSN 20071005135357Z#000003#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9984-1,dc=example,dc=com CSN 20071005135357Z#000004#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt30678-1,dc=example,dc=com CSN 20071005135357Z#000005#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9923-1,dc=example,dc=com CSN 20071005135357Z#000006#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21125-1,dc=example,dc=com CSN 20071005135358Z#000000#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6172-1,dc=example,dc=com CSN 20071005135358Z#000001#00#000000 greater than snapshot 
20071005135356Z#00000d#00#000000 Entry cn=dlt30616-1,dc=example,dc=com CSN 20071005135358Z#000002#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt6107-1,dc=example,dc=com CSN 20071005135358Z#000003#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt9934-1,dc=example,dc=com CSN 20071005135358Z#000004#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000 Entry cn=dlt21126-1,dc=example,dc=com CSN 20071005135358Z#000005#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000
Stelios Grigoriadis wrote:
The output from one run. I haven't had the time to look at it yet.
The key question is, do any of these debug messages correspond to entries that you've found are missing on the replicas?
Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000
Howard Chu wrote:
Stelios Grigoriadis wrote:
The output from one run. I haven't had the time to look at it yet.
The key question is, do any of these debug messages correspond to entries that you've found are missing on the replicas?
Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000
The entries that are missing from the replicas are the ones with "greater than snapshot". I commented out the if-statement around line 1869 in syncprov.c and couldn't reproduce the problem.
On a side note, if an exception were made for refreshAndPersist to send results greater than the snapshotted ctxcsn, what would the side effects be? Is it possible to get "ghost" entries in the replicas? Suppose an add operation fails to commit but the entry has already been replicated; is that possible?
Stelios Grigoriadis wrote:
Howard Chu wrote:
Stelios Grigoriadis wrote:
The output from one run. I haven't had the time to look at it yet.
The key question is, do any of these debug messages correspond to entries that you've found are missing on the replicas?
Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000
The entries that are missing from the replicas are the ones with "greater than snapshot". I commented out the if-statement around line 1869 in syncprov.c and couldn't reproduce the problem.
On a side note, if an exception were made for refreshAndPersist to send results greater than the snapshotted ctxcsn, what would the side effects be? Is it possible to get "ghost" entries in the replicas? Suppose an add operation fails to commit but the entry has already been replicated; is that possible?
No, not possible. The backends will not return uncommitted entries in search responses.
The only possible side-effect is that the replicas may receive some entries twice if they are picked up during both the refresh phase and during the persist phase. That is, it will consume a little extra network bandwidth.
I'm looking at restructuring the search code to try to avoid this duplication.
Stelios Grigoriadis wrote:
Howard Chu wrote:
Stelios Grigoriadis wrote:
The output from one run. I haven't had the time to look at it yet.
The key question is, do any of these debug messages correspond to entries that you've found are missing on the replicas?
Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000
The entries that are missing from the replicas are the ones with "greater than snapshot". I commented out the if-statement around line 1869 in syncprov.c and couldn't reproduce the problem.
Can you please try the patch in rev 1.205 of syncprov.c and see whether your test results improve? (You should remove your other change to this source file first.)
Howard Chu wrote:
Stelios Grigoriadis wrote:
Howard Chu wrote:
Stelios Grigoriadis wrote:
The output from one run. I haven't had the time to look at it yet.
The key question is, do any of these debug messages correspond to entries that you've found are missing on the replicas?
Entry cn=dlt30604-0,dc=example,dc=com CSN 20071005135356Z#00000e#00#000000 greater than snapshot 20071005135356Z#00000d#00#000000
The entries that are missing from the replicas are the ones with "greater than snapshot". I commented out the if-statement around line 1869 in syncprov.c and couldn't reproduce the problem.
Can you please try the patch in rev 1.205 of syncprov.c and see whether your test results improve? (You should remove your other change to this source file first.)
I'm fairly new to this. All I can find is the 1.205 version of syncprov.c, which I can't use (compilation errors: ‘ConfigArgs’ has no member named ‘cr_msg’, etc.). Should I retrieve the latest through CVS first?
I couldn't find any patch.
Stelios Grigoriadis wrote:
Howard Chu wrote:
Can you please try the patch in rev 1.205 of syncprov.c and see whether your test results improve? (You should remove your other change to this source file first.)
I'm fairly new to this. All I can find is the 1.205 version of syncprov.c, which I can't use (compilation errors: ‘ConfigArgs’ has no member named ‘cr_msg’, etc.). Should I retrieve the latest through CVS first?
I couldn't find any patch.
The patch is produced using: cvs diff -kk -r1.204 -r1.205 syncprov.c
But I've attached a copy here that should work with 2.3.
Howard Chu wrote:
Stelios Grigoriadis wrote:
Howard Chu wrote:
Can you please try the patch in rev 1.205 of syncprov.c and see whether your test results improve? (You should remove your other change to this source file first.)
I'm fairly new to this. All I can find is the 1.205 version of syncprov.c, which I can't use (compilation errors: ‘ConfigArgs’ has no member named ‘cr_msg’, etc.). Should I retrieve the latest through CVS first?
I couldn't find any patch.
The patch is produced using: cvs diff -kk -r1.204 -r1.205 syncprov.c
But I've attached a copy here that should work with 2.3.
Didn't help; the replica servers still don't get synced. There is a lot of "xxx greater than snapshot yyy" output.
Stelios Grigoriadis wrote:
Howard Chu wrote:
Stelios Grigoriadis wrote:
Howard Chu wrote:
Can you please try the patch in rev 1.205 of syncprov.c and see whether your test results improve? (You should remove your other change to this source file first.)
I'm fairly new to this. All I can find is the 1.205 version of syncprov.c, which I can't use (compilation errors: ‘ConfigArgs’ has no member named ‘cr_msg’, etc.). Should I retrieve the latest through CVS first?
I couldn't find any patch.
The patch is produced using: cvs diff -kk -r1.204 -r1.205 syncprov.c
But I've attached a copy here that should work with 2.3.
Didn't help; the replica servers still don't get synced. There is a lot of "xxx greater than snapshot yyy" output.
OK. I guess we go with the other approach then.
diff -u -r1.56.2.45 syncprov.c
--- syncprov.c  22 Jul 2007 15:24:26 -0000  1.56.2.45
+++ syncprov.c  8 Oct 2007 16:05:54 -0000
@@ -1865,8 +1875,9 @@
             a = attr_find( rs->sr_operational_attrs, slap_schema.si_ad_entryCSN );
         }
         if ( a ) {
+            /* If not a persistent search */
             /* Make sure entry is less than the snapshot'd contextCSN */
-            if ( ber_bvcmp( &a->a_nvals[0], &ss->ss_ctxcsn ) > 0 ) {
+            if ( !ss->ss_so && ber_bvcmp( &a->a_nvals[0], &ss->ss_ctxcsn ) > 0 ) {
                 Debug( LDAP_DEBUG_SYNC, "Entry %s CSN %s greater than snapshot %s\n",
                     rs->sr_entry->e_name.bv_val,
                     a->a_nvals[0].bv_val,
Stelios Grigoriadis wrote:
On Fri, 2007-09-28 at 17:02 -0700, Howard Chu wrote:
Stelios Grigoriadis wrote:
But the problem remains and I have no idea why this happens. I still suspect that the problem is in the initial phase of the sync operation (the refresh stage). It might be that some of the not-yet-committed modifications don't make it into the result set of the search operation. Later, after another entry is added, the "lost" entries are never synced over.
This also cannot be the cause. The contextCSN is snapshotted at the beginning of a refresh. Only updates between the consumer's cookie CSN and the snapshot CSN are sent to the consumer. Any entries added during this refresh will be excluded from the update, and the consumer will then record the snapshot CSN. Any entries the consumer didn't pick up in this refresh pass will be picked up in the next refresh.
I agree with you; I just didn't see the "next refresh" in the code. I thought it refreshed only once and that the master would then write back all subsequent changes (syncprov_op_response -> syncprov_qstart etc.).
True, I was speaking of a refreshOnly case. Since you're apparently talking about refreshAndPersist, I guess the situation is different.
Looking at syncprov_search_response() around line 1868 I guess it's possible that a new entry was added (entryCSN > snapshotCSN) before the persistent search was queued up (which is at line 1991). If the Add was in progress but there were no persistent searches known at that time, then the resulting entry will not be propagated during the Persist phase. So it seems that we should make an exception in the test at 1868 and allow the entry to be sent if the current consumer is a refreshAndPersist consumer. Checking your debug logs (which I asked about in my previous message) would confirm that this is indeed the problem, and would also verify the solution.