Full_Name: Stephan Version: 2.4.26 OS: Debian 5.0.8 URL: ftp://ftp.openldap.org/incoming/ Submission from: (NULL) (134.100.101.1)
Hi everybody,
I have an odd problem with synchronisation. I have 5 nodes running multi master synchronisation. I recently updated one of them to 2.4.26 (they were all running 2.4.23 before) and started it with an empty database (the database directory contained nothing more than the DB_CONFIG file). Afterwards I noted that some attributes were missing, especially in the root of the directory tree, which looks like this:
dn: o=test objectClass: organization o: test
After the sync was complete, I could search for the o=test entry but couldn't see it:
ldapsearch -H "ldaps://filled.test.xyz" -D cn=superman,o=test -W -x -b o=test o=test
# extended LDIF # # LDAPv3 # base <o=test> with scope subtree # filter: o=test # requesting: ALL #
# search result search: 2 result: 0 Success
# numResponses: 1
I played around a bit and am able to reproduce the problem. I tried combinations of 2.4.23 and 2.4.26, bdb and hdb backends, bdb versions 4.5.20-13 and 4.6.21 but the problem persists. If I take the output of slapcat on one of the nodes and feed it to the empty node before its startup everything seems to be ok.
I increased the loglevel and can see the following (rid 1 being the node with the filled database and rid 2 the one with the empty database):
On the empty node (rid2) I get lots of 'syncrepl_entry: rid=001 inserted UUID xxx' 'syncprov_matchops: skipping original sid 001' and 'syncrepl_entry: rid=001 be_add' Once the sync is complete the log on the formerly empty node says:
Aug 1 17:24:55 empty slapd[7796]: do_syncrep2: rid=001 LDAP_RES_INTERMEDIATE - REFRESH_DELETE Aug 1 17:24:55 empty slapd[7796]: do_syncrep2: rid=001 cookie=rid=001,sid=001,csn=20110310162449.396663Z#000000#000#000000;20110801135804.526931Z#000000#001#000000;20110614092017.772141Z#000000#003#000000;20110722131433.793594Z#000000#004#000000;20110225131904.168266Z#000000#005#000000 Aug 1 17:24:55 empty slapd[7796]: slap_queue_csn: queing 0x9f901da8 20110801135804.526931Z#000000#001#000000 Aug 1 17:24:55 empty slapd[7796]: syncprov_matchops: skipping original sid 001 Aug 1 17:24:55 empty slapd[7796]: slap_graduate_commit_csn: removing 0x9f91f5f8 20110801135804.526931Z#000000#001#000000 Aug 1 17:24:55 empty slapd[7796]: do_syncrep2: rid=001 cookie= Aug 1 17:24:55 empty slapd[7796]: syncrepl_entry: rid=001 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) Aug 1 17:24:55 empty slapd[7796]: syncrepl_entry: rid=001 be_search (0) Aug 1 17:24:55 empty slapd[7796]: syncrepl_entry: rid=001 o=test Aug 1 17:24:55 empty slapd[7796]: syncrepl_entry: rid=001 entry unchanged, ignored (o=test) Aug 1 17:24:55 empty slapd[7796]: do_syncrep2: rid=001 LDAP_RES_INTERMEDIATE - NEW_COOKIE Aug 1 17:24:55 empty slapd[7796]: do_syncrep2: rid=001 NEW_COOKIE: rid=001,sid=001,csn=20110801152321.168164Z#000000#000#000000;20110801135804.526931Z#000000#001#000000;20110614092017.772141Z#000000#003#000000;20110722131433.793594Z#000000#004#000000;20110225131904.168266Z#000000#005#000000
On the node from which it took the database (rid 1):
Aug 1 17:24:54 filled slapd[17862]: do_syncrep2: rid=002 LDAP_RES_INTERMEDIATE - SYNC_ID_SET Aug 1 17:24:54 filled last message repeated 7 times Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 be_search (0) Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 o=test Aug 1 17:24:54 filled slapd[17862]: syncprov_sendresp: cookie= Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 be_modify o=test (0) Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) Aug 1 17:24:54 filled slapd[17862]: dn_callback : entries have identical CSN sambaDomainName=XYZ,ou=abc,o=test 20110722131010.624524Z#000000#004#000000
Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 be_search (0) Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 sambaDomainName=XYZ,ou=abc,o=test Aug 1 17:24:54 filled slapd[17862]: syncrepl_entry: rid=002 entry unchanged, ignored (sambaDomainName=XYZ,ou=abc,o=test) Aug 1 17:24:54 filled slapd[17862]: do_syncrep2: rid=002 LDAP_RES_INTERMEDIATE - REFRESH_PRESENT Aug 1 17:24:54 filled slapd[17862]: do_syncrep2: rid=002 cookie=rid=002,csn=20110801152321.168164Z#000000#000#000000 Aug 1 17:24:54 filled slapd[17862]: slap_queue_csn: queing 0x7eff600022c0 20110801152321.168164Z#000000#000#000000 Aug 1 17:24:54 filled slapd[17862]: slap_graduate_commit_csn: removing 0x7eff60007040 20110801152321.168164Z#000000#000#000000
It seems to me that the formerly empty node overwrites the entry on the node from which it took all the entries.
This is what my config looks like:
overlay syncprov syncprov-checkpoint 100 10 syncprov-sessionlog 100 syncprov-reloadhint TRUE
serverID 1 "ldaps://filled.test.xyz" serverID 2 "ldaps://empty.test.xyz"
#filled syncrepl rid=2 provider=ldaps://filled.test.xyz:636 type=refreshAndPersist searchbase="o=test" retry="120 20 300 5 1800 5 3600 +" scope=sub bindmethod=simple binddn="cn=superman,o=test" credentials="hello" tls_reqcert=never mirrormode on
And the other node: overlay syncprov syncprov-checkpoint 100 10 syncprov-sessionlog 100 syncprov-reloadhint TRUE
serverID 1 "ldaps://filled.test.xyz" serverID 2 "ldaps://empty.test.xyz"
#filled syncrepl rid=2 provider=ldaps://empty.test.xyz:636 type=refreshAndPersist searchbase="o=test" retry="120 20 300 5 1800 5 3600 +" scope=sub bindmethod=simple binddn="cn=superman,o=test" credentials="hello" tls_reqcert=never mirrormode on
It seems to me that the formerly empty node overwrites the other one after the sync.
In some cases the entries don't go missing on the source node, they just don't get replicated to the formerly emtpy node.
Unless I made some terrible mistake in the config this is seems to be a bug since in the administrator's guide http://www.openldap.org/doc/admin24/replication.html#N-Way%20Multi-Master%20... it says: "Syncrepl is self-synchronizing; you can start with a consumer database in any state from totally empty to fully synced and it will automatically do the right thing to achieve and maintain synchronization"
All nodes are configured to use the same ntp source.