Full_Name: Jonathan Clarke
Version: 2.3.38 and HEAD (but with slightly different results)
OS: Linux
URL: ftp://ftp.openldap.org/incoming/unwanted-deletes-syncrepl-glue.tar.gz
Submission from: (NULL) (213.41.243.192)

Hi folks,
I've come across an issue with a server using the glue overlay with one of its subordinate databases syncrepl'd. There are two problems:
1) the root database's contextCSN is not updated when the subordinate database changes
2) when replicating this whole server with syncrepl (to a 3rd server), certain updates cause many entries to be deleted from the consumer.
The following "schema" should describe this setup more clearly (names TOP, MIDDLE and BOTTOM are for easy reference):
TOP:
|----------------------------|
| One bdb backend:           |
| dc=ossa,dc=linagora,dc=org |
|----------------------------|
              |
MIDDLE:       |
|-------------------------------|
| Two bdb backends + glue:      |
| 1) dc=ossa,dc=linagora,dc=org |
|    subordinate                |
|    syncrepl from above server |
| 2) dc=linagora,dc=org         |
|    'master' for this branch   |
|                               |
| syncprov overlay              |
|-------------------------------|
              |
BOTTOM:       |
|-------------------------------|
| One bdb backend:              |
| dc=linagora,dc=org            |
| syncrepl from above server    |
|-------------------------------|
All config files, and some sample data sets, are in the archive at the URL above.
I have tested this on both 2.3.38 and HEAD (same version on all 3 servers), and behaviour is quite different, though the end result is the same.
On 2.3.38:
1) Set up all three servers, make sure they're sync'ed.
2) Modify some attribute on the TOP server (I add a description to the root DN, dc=ossa,dc=linagora,dc=org)
3) Watch this modification propagate to the MIDDLE server. The contextCSN in dc=linagora,dc=org is not updated, but the one in dc=ossa,dc=linagora,dc=org is (it equals the entryCSN of the entry I modified). The output is the following with loglevel=stats+sync:
8>---------------------------------------------------------------
request done: ld 0x822ec20 msgid 1
do_syncrep2: rid 007 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
syncrepl_entry: rid 007 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid 007 be_search (0)
syncrepl_entry: rid 007 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid 007 be_modify (0)
request done: ld 0x822ec20 msgid 2
do_syncrep2: rid 007 LDAP_RES_SEARCH_RESULT
8>---------------------------------------------------------------
4) Watch the BOTTOM server (see schema above) do its syncrepl and delete some entries below the glued database (glued on the MIDDLE server, not on this one). The output is the following with loglevel=stats+sync:
8>---------------------------------------------------------------
request done: ld 0x822a320 msgid 1
request done: ld 0x822a320 msgid 2
do_syncrep2: rid 888 LDAP_RES_SEARCH_RESULT
request done: ld 0x822a320 msgid 1
do_syncrep2: rid 888 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
syncrepl_entry: rid 888 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid 888 be_search (0)
syncrepl_entry: rid 888 dc=linagora,dc=org
syncrepl_entry: rid 888 be_modify (0)
syncrepl_entry: rid 888 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid 888 be_search (0)
syncrepl_entry: rid 888 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid 888 be_modify (0)
request done: ld 0x822a320 msgid 2
do_syncrep2: rid 888 LDAP_RES_SEARCH_RESULT
syncrepl_del_nonpresent: rid 888 be_delete uid=replicator,dc=ossa,dc=linagora,dc=org (0)
syncrepl_del_nonpresent: rid 888 be_delete uid=root,dc=ossa,dc=linagora,dc=org (0)
8>---------------------------------------------------------------
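
For anyone reproducing this, a small sketch for pulling the unwanted deletes out of a consumer log may save some scrolling. It only assumes the 2.3.38 message format shown in the BOTTOM log above ("syncrepl_del_nonpresent: rid N be_delete DN (RC)"); the function name deleted_entries and the SAMPLE text are made up for illustration, and other versions may word the message differently.

```python
import re

# Matches consumer-side delete lines in the 2.3.38 format shown above, e.g.:
#   syncrepl_del_nonpresent: rid 888 be_delete uid=root,dc=ossa,dc=linagora,dc=org (0)
# (format taken from the log excerpt; other versions may differ)
DELETE_RE = re.compile(
    r"^syncrepl_del_nonpresent: rid (\d+) be_delete (.+) \((\d+)\)[ \t]*$",
    re.MULTILINE,
)


def deleted_entries(log_text):
    """Return a list of (rid, dn, result_code) for every delete found in the log."""
    return [
        (m.group(1), m.group(2), int(m.group(3)))
        for m in DELETE_RE.finditer(log_text)
    ]


# Hypothetical sample, copied from the BOTTOM server output above.
SAMPLE = """\
do_syncrep2: rid 888 LDAP_RES_SEARCH_RESULT
syncrepl_del_nonpresent: rid 888 be_delete uid=replicator,dc=ossa,dc=linagora,dc=org (0)
syncrepl_del_nonpresent: rid 888 be_delete uid=root,dc=ossa,dc=linagora,dc=org (0)
"""

if __name__ == "__main__":
    for rid, dn, rc in deleted_entries(SAMPLE):
        print(f"rid {rid}: deleted {dn} (result {rc})")
```

Running it against the full BOTTOM log after each test modification on TOP gives a quick list of which DNs were removed by that refresh.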
On HEAD, things are quite different:
1) Start the TOP server.
2) Start the MIDDLE server. Errors happen immediately, on the first sync attempt. The output is the following with loglevel=stats+sync:
8>---------------------------------------------------------------
slapd starting
request done: ld 0x82c2228 msgid 1
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5797e8c-0486-102c-83e0-79137da6179f
syncrepl_entry: rid=7 be_search (32)
syncrepl_entry: rid=7 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 be_add (0)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a2f834-0486-102c-83e1-79137da6179f
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=root,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 be_add (0)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a31526-0486-102c-83e2-79137da6179f
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=replicator,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 be_add (0)
request done: ld 0x82c2228 msgid 2
do_syncrep2: rid=7 LDAP_RES_SEARCH_RESULT
nonpresent_callback: rid=7 got UUID b5797e8c-0486-102c-83e0-79137da6179f, dn dc=ossa,dc=linagora,dc=org
nonpresent_callback: rid=7 got UUID b5a2f834-0486-102c-83e1-79137da6179f, dn uid=root,dc=ossa,dc=linagora,dc=org
nonpresent_callback: rid=7 got UUID b5a31526-0486-102c-83e2-79137da6179f, dn uid=replicator,dc=ossa,dc=linagora,dc=org
adresse de be_modify : 80c8840
null_callback : error code 0x32
syncrepl_updateCookie: rid=7 be_modify failed (50)
request done: ld 0x82c2228 msgid 1
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5797e8c-0486-102c-83e0-79137da6179f
dn_callback : entries have identical CSN dc=ossa,dc=linagora,dc=org ours 20071001162546.703481Z#000000#000#000000, new 20071001162546.703481Z#000000#000#000000
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 entry unchanged, ignored (dc=ossa,dc=linagora,dc=org)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a2f834-0486-102c-83e1-79137da6179f
dn_callback : entries have identical CSN uid=root,dc=ossa,dc=linagora,dc=org ours 20071001162546.975377Z#000000#000#000000, new 20071001162546.975377Z#000000#000#000000
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=root,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 entry unchanged, ignored (uid=root,dc=ossa,dc=linagora,dc=org)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a31526-0486-102c-83e2-79137da6179f
dn_callback : entries have identical CSN uid=replicator,dc=ossa,dc=linagora,dc=org ours 20071001162546.976133Z#000000#000#000000, new 20071001162546.976133Z#000000#000#000000
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=replicator,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 entry unchanged, ignored (uid=replicator,dc=ossa,dc=linagora,dc=org)
8>---------------------------------------------------------------
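
To compare the "ours"/"new" CSNs from the dn_callback lines (or a contextCSN against an entryCSN) without eyeballing them, here is a rough sketch of a CSN parser. It assumes the HEAD layout seen in the log above, i.e. a timestamp with microseconds followed by three '#'-separated fields, which I take to be change count, server ID and modification number written in hex; the names CSN and parse_csn are just illustrative, and the 2.3.38 format may not match.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    """Fields of a CSN as printed in the HEAD log above (field meaning assumed)."""
    timestamp: datetime
    count: int
    sid: int
    mod: int


def parse_csn(value):
    """Split a CSN such as 20071001162546.703481Z#000000#000#000000 into its parts."""
    ts, count, sid, mod = value.split("#")
    when = datetime.strptime(ts, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # Assumption: the three trailing fields are hexadecimal numbers.
    return CSN(when, int(count, 16), int(sid, 16), int(mod, 16))


if __name__ == "__main__":
    ours = parse_csn("20071001162546.703481Z#000000#000#000000")
    new = parse_csn("20071001162546.975377Z#000000#000#000000")
    # Tuples compare field by field, so the later CSN sorts higher.
    print("new is later than ours:", new > ours)
```

Since CSN is a plain tuple, max() over a list of parsed entryCSNs gives the value one would expect the contextCSN to reach.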
Obviously, the desired result is that entries are not deleted from the BOTTOM server when replication happens. I'm a bit at a loss as to the logic behind these updates, and how to go about correcting the behaviour.
I tried applying a patch backported from HEAD to 2.3.38 that makes syncrepl update the contextCSN in the real root (not the bdb database root). It works in that the contextCSN is updated correctly, but replication to BOTTOM still has unwanted deletes. The patch is in the archive attached (update-root-contextCSN.diff) and corresponds to revisions 1.308 and 1.309 in syncrepl.c CVS log.
I am completely available to provide any more information necessary: logs, testing, gdb output, etc. Any help or pointers most welcome!
Thanks in advance,
Jon