I'm trying to understand why changes made to SID 1 in my mirror set while SID 2 is down do not get propagated to SID 2 when it comes up.
I did another test, resulting in SID 2 having two contextCSNs:
root object:
entryCSN: 20091204151327.370998Z#000000#001#000000
contextCSN: 20091204151327.735435Z#000000#001#000000
contextCSN: 20091204151430.680725Z#000000#002#000000
However, no changes have been made to server2, only to server1.
No entry has the CSN with sid=002.
Is this expected?
/Peter
Peter Mogensen wrote:
I'm trying to understand why changes made to SID 1 in my mirror set while SID 2 is down do not get propagated to SID 2 when it comes up.
You've posted quite a lot of stuff on that topic already, but nothing useful that could help anyone else to see what you're actually doing. (E.g., such essentials as your software version, config files, and the actual commands/operations you perform in the precise sequence you execute them.)
I did another test, resulting in SID 2 having two contextCSNs:
root object:
entryCSN: 20091204151327.370998Z#000000#001#000000
contextCSN: 20091204151327.735435Z#000000#001#000000
contextCSN: 20091204151430.680725Z#000000#002#000000
However, no changes have been made to server2, only to server1.
No entry has the CSN with sid=002.
Is this expected?
It is not entirely unexpected...
If the last operation that occurred on that server was a Delete, then the contextCSN will be the CSN of that Delete operation, but obviously there will not be any entry in the server with that CSN.
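To check a claim like "no entry has a CSN with sid=002" against a slapcat dump, the CSN values can be split into their four '#'-separated fields (timestamp, change count, server ID, modification number) and tallied per server ID. The following is only a minimal sketch: the helper names parse_csn and tally_sids are made up for illustration, and it assumes the numeric fields are hexadecimal, which also covers the two-digit "00" SID seen in 2.3-era dumps.

```python
from collections import Counter, namedtuple

# Field layout of a CSN: timestamp#count#sid#mod
CSN = namedtuple("CSN", "timestamp count sid mod")


def parse_csn(value):
    """Split one CSN string into its four fields (numeric fields read as hex)."""
    timestamp, count, sid, mod = value.strip().split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def tally_sids(ldif_lines):
    """Count entryCSN/contextCSN values per (attribute, server ID)."""
    tally = Counter()
    for line in ldif_lines:
        attr, sep, value = line.partition(":")
        if sep and attr in ("entryCSN", "contextCSN"):
            tally[(attr, parse_csn(value).sid)] += 1
    return tally


# The three values reported for the root object in this thread.
SAMPLE = [
    "entryCSN: 20091204151327.370998Z#000000#001#000000",
    "contextCSN: 20091204151327.735435Z#000000#001#000000",
    "contextCSN: 20091204151430.680725Z#000000#002#000000",
]

if __name__ == "__main__":
    for (attr, sid), n in sorted(tally_sids(SAMPLE).items()):
        print(f"{attr} sid={sid:03x}: {n}")
```

Feeding it the lines of a full slapcat dump instead of SAMPLE shows at a glance whether any entryCSN carries the SID that appears in the extra contextCSN.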
Howard Chu wrote:
Peter Mogensen wrote:
I'm trying to understand why changes made to SID 1 in my mirror set while SID 2 is down do not get propagated to SID 2 when it comes up.
You've posted quite a lot of stuff on that topic already, but nothing useful that could help anyone else to see what you're actually doing. (E.g., such essentials as your software version, config files, and the actual commands/operations you perform in the precise sequence you execute them.)
Well... Before posting the config, I had hoped to have the precise sequence of steps (including commands) that I used to load the data verified as the correct procedure. I have posted versions and command sequences. (The version is currently 2.4.20 + the patch for ITS#6408, with DB 4.8.) The config is the same as in ITS#6365: http://www.openldap.org/its/index.cgi/Incoming?id=6365
If the last operation that occurred on that server was a Delete, then the contextCSN will be the CSN of that Delete operation, but obviously there will not be any entry in the server with that CSN.
Hmm... I did no delete, but maybe syncrepl did. I can't figure out why, given the procedure I used:
1) Took a slapcat-generated LDIF from a 2.3.x setup.
2) Removed all entryCSN and contextCSN lines.
3) Ran "slapadd -S 1 -q -w -l ~/load_noCSN.ldif" on server-1.
4) Did a "slapcat > toserver2.ldif" on server-1.
5) Started server-1 and let applications create and modify objects.
6) Moved toserver2.ldif to server-2.
7) Ran "slapadd -q -l toserver2.ldif" on server-2.
8) Started server-2.
/Peter
Peter Mogensen writes:
I'm trying to understand why changes made to SID 1 in my mirror set while SID 2 is down do not get propagated to SID 2 when it comes up.
Maybe your mirror is configured with refreshAndPersist mode and you have not specified a retry interval? Then the default is 1 hour, according to the slapd.conf manpage.
Hallvard B Furuseth wrote:
Peter Mogensen writes:
I'm trying to understand why changes made to SID 1 in my mirror set while SID 2 is down do not get propagated to SID 2 when it comes up.
Maybe your mirror is configured with refreshAndPersist mode and you have not specified a retry interval? Then the default is 1 hour, according to the slapd.conf manpage.
No. retry is "60 +" (1 minute from what I read).
Since no one has objected to the 8-step procedure I posted, I will assume that it is the correct way to load a huge LDIF into an empty mirrormode setup. So, since it's not the procedure, it must be either my configuration or a bug. I'll assume it's my configuration, though I suspect this message is about the same problem: http://www.openldap.org/lists/openldap-software/200911/msg00058.html
So here's my configuration in a step-by-step sequence. I do:
* First install openldap 2.4.20 / db 4.8.24 on two debian Lenny systems.
* Set /etc/ldap/slapd.conf to this:
===================================================================
gentlehup on
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
loglevel none

tool-threads 8

# Modules
modulepath /usr/lib/ldap
moduleload back_hdb
moduleload syncprov

# Schemas
include /etc/ldap/schema/core.schema
include /etc/ldap/schema/cosine.schema
include /etc/ldap/schema/inetorgperson.schema

# Limits
disallow bind_anon
idletimeout 120
sizelimit 2000

# TLS/Auth
TLSCACertificateFile /etc/ldap/ssl/ca.crt
TLSCertificateFile /etc/ldap/ssl/server.crt
TLSCertificateKeyFile /etc/ldap/ssl/server.nopass.key
TLSCipherSuite "NULL-SHA"

# Allow root to configure slapd via ldapi:///
TLSVerifyClient demand
authz-regexp "gidNumber=0\+uidNumber=0,cn=peercred,cn=external,cn=auth" "cn=config"

authz-regexp "email=root@example.com,cn=config,ou=dev,o=example.com,st=Denmark,c=DK" "cn=config"

##### Mirror mode ####
serverID 1

database config

limits dn.exact="cn=config" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited

syncrepl rid=1
  provider=ldaps://server1.example.com:636/
  searchbase="cn=config"
  type=refreshAndPersist
  retry="60 +"
  scope=sub
  schemachecking=on
  bindmethod=sasl
  binddn="cn=config"
  saslmech="EXTERNAL"
  tls_cert=/etc/ldap/ssl/config.crt
  tls_key=/etc/ldap/ssl/config.nopass.key
  tls_cacert=/etc/ldap/ssl/ca.crt
  tls_cipher_suite="NULL-SHA"

syncrepl rid=2
  provider=ldaps://server2.example.com:636/
  searchbase="cn=config"
  type=refreshAndPersist
  retry="60 +"
  scope=sub
  schemachecking=on
  bindmethod=sasl
  binddn="cn=config"
  saslmech="EXTERNAL"
  tls_cert=/etc/ldap/ssl/config.crt
  tls_key=/etc/ldap/ssl/config.nopass.key
  tls_cacert=/etc/ldap/ssl/ca.crt
  tls_cipher_suite="NULL-SHA"

overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100
syncprov-reloadhint TRUE

mirrormode on
=====================================================
* Then, I run slaptest -f /etc/ldap/slapd.conf -F /etc/ldap/slapd.d to convert the above to a cn=config based setup.
* Then I start slapd on both servers.
$ /usr/sbin/slapd -h ldapi:/// ldaps://server1.example.com:636/ \
    ldap://server1.example.com/ -g openldap -u openldap \
    -F /etc/ldap/slapd.d -4
... all of the above of course different wrt. server1/server2, SID 1/2
* Then I load the following LDIF files on server 1 with
$ ldapadd -YEXTERNAL -H ldapi:/// -f <LDIFFILE>
In sequence:
==============================
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: refint
=============================
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: back_bdb
==============================
... a bunch of schemas, like:
dn: cn=evolutionperson,cn=schema,cn=config
==============================
dn: olcDatabase={1}hdb,cn=config
objectClass: olcHdbConfig
objectClass: olcDatabaseConfig
olcDatabase: hdb
olcSuffix: cn=data,dc=example,dc=com
olcRootDN: cn=config
olcDbDirectory: /var/lib/ldap/cn=data,dc=example,dc=com
olcDbMode: 0660
olcDbConfig: set_cachesize 2 0 0
olcDbConfig: set_lg_bsize 2097512
olcDbConfig: set_lg_dir /var/lib/ldap/cn=data,dc=example,dc=com-log
olcDbConfig: set_flags DB_LOG_AUTOREMOVE
olcDbConfig: set_lk_max_objects 5000
olcDbConfig: set_lk_max_locks 5000
olcDbConfig: set_lk_max_lockers 5000
olcDbCheckpoint: 1024 10
olcDbCachefree: 16
olcDbCachesize: 100000
olcDbIDLcacheSize: 300000
olcDbLinearIndex: FALSE
olcDbIndex: objectClass eq
olcDbIndex: entryUUID eq
olcDbIndex: entryCSN eq
olcDbIndex: cn eq,sub
olcDbIndex: uid eq
olcDbIndex: ou eq
olcDbIndex: o eq
olcDbIndex: givenName eq,sub
olcDbIndex: sn eq,sub
olcDbIndex: mail eq,sub
olcDbIndex: member eq
olcDbIndex: reader eq
olcDbIndex: writer eq
olcDbIndex: admin eq
olcAccess: to dn.base="cn=data,dc=example,dc=com" attrs=userPassword by * auth
olcAccess: to dn.base="cn=data,dc=example,dc=com" by dn.base="cn=data,dc=example,dc=com" search
olcAccess: to dn.children="cn=data,dc=example,dc=com" by dn.base="cn=data,dc=example,dc=com" write
olcSyncRepl: rid=3 provider=ldaps://server1.example.com:636/
  searchbase="cn=data,dc=example,dc=com" type=refreshAndPersist
  retry="60 +" scope=sub schemachecking=on
  bindmethod=sasl binddn="cn=config" saslmech="EXTERNAL"
  tls_cert=/etc/ldap/ssl/config.crt
  tls_key=/etc/ldap/ssl/config.nopass.key
  tls_cacert=/etc/ldap/ssl/ca.crt
  tls_cipher_suite="NULL-SHA"
olcSyncRepl: rid=4 provider=ldaps://server2.example.com:636/
  searchbase="cn=data,dc=example,dc=com" type=refreshAndPersist
  retry="60 +" scope=sub schemachecking=on
  bindmethod=sasl binddn="cn=config" saslmech="EXTERNAL"
  tls_cert=/etc/ldap/ssl/config.crt
  tls_key=/etc/ldap/ssl/config.nopass.key
  tls_cacert=/etc/ldap/ssl/ca.crt
  tls_cipher_suite="NULL-SHA"
olcMirrorMode: TRUE
olcLimits: dn.base="cn=config" size.soft=unlimited size.hard=unlimited time.soft=unlimited time.hard=unlimited

dn: olcOverlay=syncprov,olcDatabase={1}hdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
olcSpCheckpoint: 100 600
olcSpSessionlog: 100
olcSpReloadHint: TRUE

dn: olcOverlay=refint,olcDatabase={1}hdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcRefintConfig
olcOverlay: refint
olcRefintAttribute: member
===========================================
* All of the above gets properly replicated to server2.
* Then I take an LDIF from slapcat on slapd 2.3.30 and run:
$ cat dump.ldif | grep -v -E '^(entryCSN:|contextCSN:)' > load_noCSN.ldif
The data (dump.ldif) looks like this (the root object):
===================================
dn: cn=data,dc=example,dc=com
objectClass: top
objectClass: NamedObject
objectClass: dcObject
objectClass: simpleSecurityObject
cn: data
userPassword:: BASE64
structuralObjectClass: NamedObject
entryUUID: ab7d5590-3e90-102c-8c03-91e70ecd3b46
creatorsName: cn=data,dc=example,dc=com
modifiersName: cn=data,dc=example,dc=com
createTimestamp: 20071214130312Z
modifyTimestamp: 20071214130312Z
entryCSN: 20071214130312Z#000000#00#000000
contextCSN: 20091118105948Z#000001#00#000000
=====================================
* Then I STOP slapd on both servers.
* Then I load the output on server1:
$ slapadd -S 1 -q -w -l ~/load_noCSN.ldif
* Then I immediately slapcat this and move it to server2:
$ slapcat > ~/toserver2.ldif
* And load it on server2:
$ slapadd -q -l ~/toserver2.ldif
* I start server1, but BEFORE I start server2 I make ONE SINGLE CHANGE:
=================
dn: cn=data,dc=example,dc=com
changetype: modify
replace: userPassword
userPassword: NEWBASE64
=================
* THEN I start server2 and monitor its data.
What I find is that the contextCSN from server1 gets replicated, but the change doesn't. Also, I see a contextCSN on server2 with SID 002 even though I have not done any operations on server2.
I'm sorry, this was quite a lot. I had hoped not to throw it at the list if my procedure was wrong from the beginning.
regards, Peter
Peter Mogensen wrote:
Hallvard B Furuseth wrote:
Peter Mogensen writes:
I'm trying to understand why changes made to SID 1 in my mirror set while SID 2 is down do not get propagated to SID 2 when it comes up.
Maybe your mirror is configured with refreshAndPersist mode and you have not specified a retry interval? Then the default is 1 hour, according to the slapd.conf manpage.
No. retry is "60 +" (1 minute from what I read).
Since no one has objected to the 8-step procedure I posted, I will assume that it is the correct way to load a huge LDIF into an empty mirrormode setup.
There is no need for your step #2.
Given a valid slapcat from OpenLDAP 2.3 you should be able to slapadd it directly in 2.4 without using -S or -w in your step #3. Therefore you don't need step #4.
So, since it's not the procedure, it must be either my configuration or a bug. I'll assume it's my configuration, though I suspect this message is about the same problem: http://www.openldap.org/lists/openldap-software/200911/msg00058.html
No, that message refers to a bug that is definitely fixed in 2.4.20. (ITS#6367)
So here's my configuration in a step-by-step sequence. I do:
First install openldap 2.4.20 / db 4.8.24 on two debian Lenny systems.
Set /etc/ldap/slapd.conf to this:
===================================================================
gentlehup on
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
loglevel none
tool-threads 8
You have 8 CPUs?
# Allow root to configure slapd via ldapi:///
TLSVerifyClient demand
authz-regexp "gidNumber=0\+uidNumber=0,cn=peercred,cn=external,cn=auth" "cn=config"
Neatness nit: your TLSVerifyClient is obviously under the wrong comment.
authz-regexp "email=root@example.com,cn=config,ou=dev,o=example.com,st=Denmark,c=DK" "cn=config"
##### Mirror mode #### serverID 1
database config
limits dn.exact="cn=config" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
The rootdn is always unlimited, this clause is superfluous.
Then, I run slaptest -f /etc/ldap/slapd.conf -F /etc/ldap/slapd.d to convert the above to a cn=config based setup.
Then I start slapd on both servers.
$ /usr/sbin/slapd -h ldapi:/// ldaps://server1.example.com:636/ \
    ldap://server1.example.com/ -g openldap -u openldap \
    -F /etc/ldap/slapd.d -4
That won't work in typical Unix shells without quotes.
... all of the above of course different wrt. server1/server2, SID 1/2
- Then I load the following LDIF files on server 1 with
$ ldapadd -YEXTERNAL -H ldapi:/// -f <LDIFFILE>
In sequence:
==============================
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: refint
=============================
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: back_bdb
==============================
Could have just used one mod request for both of those. Why are you loading back-bdb when you're just using back-hdb and it's already loaded?
What I find is that the contextCSN from server1 gets replicated, but the change doesn't. Also, I see a contextCSN on server2 with SID 002 even though I have not done any operations on server2.
No idea what that is. Your debug logs should tell what it was doing.
Howard Chu wrote:
There is no need for your step #2.
(step #2 is removing all entryCSN, contextCSN lines).
I did so because the SID of CSN values from the 2.3.30 dump is 00: entryCSN: 20071214130312Z#000000#00#000000
Current SID is 001 or 002.
Given a valid slapcat from OpenLDAP 2.3 you should be able to slapadd it directly in 2.4 without using -S or -w in your step #3. Therefore you don't need step #4.
Ok. Then I misunderstood your post: http://www.openldap.org/lists/openldap-technical/200911/msg00066.html
I read it as the SID of "00" from 2.3 was not a correct CSN from a 2.4 backup.
So, since it's not the procedure, it must be either my configuration or a bug. I'll assume it's my configuration, though I suspect this message is about the same problem: http://www.openldap.org/lists/openldap-software/200911/msg00058.html
No, that message refers to a bug that is definitely fixed in 2.4.20. (ITS#6367)
ok.
tool-threads 8
You have 8 CPUs?
Yes. Actually, I have 8 hyperthreaded CPUs; mpstat -P ALL shows 16 cores.
# Allow root to configure slapd via ldapi:///
TLSVerifyClient demand
authz-regexp "gidNumber=0\+uidNumber=0,cn=peercred,cn=external,cn=auth" "cn=config"
Neatness nit: your TLSVerifyClient is obviously under the wrong comment.
Oh... yeah sure. I've messed around a bit after that comment was written.
limits dn.exact="cn=config" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
The rootdn is always unlimited, this clause is superfluous.
Ahh... thanks. It's a remnant from having a special user for syncrepl. I changed to using the rootdn to simplify ACLs.
- Then I start slapd on both servers.
$ /usr/sbin/slapd -h ldapi:/// ldaps://server1.example.com:636/ \
    ldap://server1.example.com/ -g openldap -u openldap \
    -F /etc/ldap/slapd.d -4
That won't work in typical Unix shells without quotes.
Yeah... actually. The above text was pasted from "ps ax" output. The server is started from /etc/init.d/slapd and options come from /etc/default/slapd.
- Then I load the following LDIF files on server 1 with
$ ldapadd -YEXTERNAL -H ldapi:/// -f <LDIFFILE>
In sequence:
==============================
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: refint
=============================
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: back_bdb
==============================
Could have just used one mod request for both of those. Why are you loading back-bdb when you're just using back-hdb and it's already loaded?
Because I've tried to reproduce the problem with both back_hdb and back_bdb. I first used back_hdb (hence configured in slapd.conf), then switched to back_bdb, which is why I added the above ModuleLoad LDIF, reproduced the problem and switched back to HDB. Of course, now the last bit above is redundant.
No idea what that is. Your debug logs should tell what it was doing.
I've tried a lot of loglevels and looked for anything suspicious. I noticed (as I mentioned) this: bdb_index_read: failed (-30989)
... and another thing I find weird. The last entry in the LDIF is special in the log, like:
=======================================================
Dec 4 14:14:30 server1 slapd[5433]: <= bdb_dn2id_children("cn=me,ou=3,uid=apm,o=net,cn=data,dc=example,dc=com"):no (-30989)
Dec 4 14:14:30 server1 slapd[5433]: Entry cn=me,ou=3,uid=apm,o=net,cn=data,dc=example,dc=com changed by peer, ignored
Dec 4 14:14:30 server1 slapd[5433]: send_ldap_result: conn=1004 op=1 p=3
Dec 4 14:14:30 server1 slapd[5433]: send_ldap_result: err=0 matched="" text=""
Dec 4 14:14:30 server1 slapd[5433]: syncprov_search_response: cookie=rid=003,sid=001,csn=20091204141336.982142Z#000000#001#000000
========================================================
The "ignored" message is what led me to suspect the bug you say was fixed as ITS#6367.
/Peter
Peter Mogensen wrote:
Howard Chu wrote:
There is no need for your step #2.
(step #2 is removing all entryCSN, contextCSN lines).
I did so because the SID of CSN values from the 2.3.30 dump is 00: entryCSN: 20071214130312Z#000000#00#000000
Current SID is 001 or 002.
Given a valid slapcat from OpenLDAP 2.3 you should be able to slapadd it directly in 2.4 without using -S or -w in your step #3. Therefore you don't need step #4.
Ok. Then I misunderstood your post: http://www.openldap.org/lists/openldap-technical/200911/msg00066.html
I read it as the SID of "00" from 2.3 was not a correct CSN from a 2.4 backup.
It's fine. OpenLDAP 2.4 will accept CSNs from all previous OpenLDAP releases. You only need to worry about these steps if there are no CSNs at all, or they came from some non-OpenLDAP software.
No idea what that is. Your debug logs should tell what it was doing.
I've tried a lot of loglevels and looked for anything suspicious. I noticed (as I mentioned) this: bdb_index_read: failed (-30989)
... and another thing I find weird. The last entry in the LDIF is special in the log, like:
=======================================================
Dec 4 14:14:30 server1 slapd[5433]: <= bdb_dn2id_children("cn=me,ou=3,uid=apm,o=net,cn=data,dc=example,dc=com"):no (-30989)
Dec 4 14:14:30 server1 slapd[5433]: Entry cn=me,ou=3,uid=apm,o=net,cn=data,dc=example,dc=com changed by peer, ignored
Dec 4 14:14:30 server1 slapd[5433]: send_ldap_result: conn=1004 op=1 p=3
Dec 4 14:14:30 server1 slapd[5433]: send_ldap_result: err=0 matched="" text=""
Dec 4 14:14:30 server1 slapd[5433]: syncprov_search_response: cookie=rid=003,sid=001,csn=20091204141336.982142Z#000000#001#000000
========================================================
Sounds like a change was written on server2 and received on server1. The above message just means that server1 is not going to try to send the same change back to server2. (I.e., perfectly normal.) So the question is, why are you saying that no writes have occurred on server2, when the logs and contextCSNs show otherwise?
Howard Chu wrote:
Sounds like a change was written on server2 and received on server1. The above message just means that server1 is not going to try to send the same change back to server2. (I.e., perfectly normal.) So the question is, why are you saying that no writes have occurred on server2, when the logs and contextCSNs show otherwise?
Good question. Because I really don't make any changes on server2. But I can't say whether syncrepl does.
Basically, I load the data on server1 AND server2, start server1, make changes to server1 and THEN start server2. Besides the one slapadd I do to load the LDIF from server1, the only operations I do on server2 are ldapsearch and slapcat.
/Peter
--On Saturday, December 05, 2009 12:46 AM +0100 Peter Mogensen apm@mutex.dk wrote:
Howard Chu wrote:
Sounds like a change was written on server2 and received on server1. The above message just means that server1 is not going to try to send the same change back to server2. (I.e., perfectly normal.) So the question is, why are you saying that no writes have occurred on server2, when the logs and contextCSNs show otherwise?
Good question. Because I really don't make any changes on server2. But I can't say whether syncrepl does.
Basically, I load the data on server1 AND server2, start server1, make changes to server1 and THEN start server2. Besides the one slapadd I do to load the LDIF from server1, the only operations I do on server2 are ldapsearch and slapcat.
By "writes" that would include anything changed by syncrepl. I think Howard's point is that server2 says it has received writes aka changes from server1. So the question is, what makes you think that server2 hasn't gotten said writes when it is reporting that it is?
--Quanah
--
Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
By "writes" that would include anything changed by syncrepl. I think Howard's point is that server2 says it has received writes aka changes from server1. So the question is, what makes you think that server2 hasn't gotten said writes when it is reporting that it is?
As I said, server2 does get the updated contextCSN from server1, but only the contextCSN. The data changes I make are not propagated.
/Peter
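One way to pin down "the contextCSN arrived but the data did not" is to compare the entryCSN of every entry in a slapcat dump from each server. The following is only a rough sketch: the function names are invented for illustration, the two sample dumps are made-up data (the server2 CSN in particular is hypothetical), and base64-encoded DNs ("dn::") are not handled.

```python
def parse_entry_csns(ldif_text):
    """Map each entry DN in an LDIF dump to its entryCSN (None if absent)."""
    # Unfold LDIF continuation lines: a newline followed by a single space.
    unfolded = ldif_text.replace("\r\n", "\n").replace("\n ", "")
    result = {}
    for block in unfolded.split("\n\n"):
        dn = csn = None
        for line in block.splitlines():
            if line.startswith("dn: "):
                dn = line[len("dn: "):]
            elif line.startswith("entryCSN: "):
                csn = line[len("entryCSN: "):]
        if dn is not None:
            result[dn] = csn
    return result


def stale_entries(provider_ldif, consumer_ldif):
    """Return DNs whose entryCSN on the consumer differs from the provider."""
    provider = parse_entry_csns(provider_ldif)
    consumer = parse_entry_csns(consumer_ldif)
    return sorted(dn for dn, csn in provider.items() if consumer.get(dn) != csn)


# Illustrative data only: one changed root entry, one unchanged child entry.
SERVER1_DUMP = """dn: cn=data,dc=example,dc=com
cn: data
entryCSN: 20091204151327.370998Z#000000#001#000000

dn: o=net,cn=data,dc=example,dc=com
o: net
entryCSN: 20091204150000.000000Z#000000#001#000000
"""

SERVER2_DUMP = """dn: cn=data,dc=example,dc=com
cn: data
entryCSN: 20091204150000.000000Z#000000#001#000000

dn: o=net,cn=data,dc=example,dc=com
o: net
entryCSN: 20091204150000.000000Z#000000#001#000000
"""

if __name__ == "__main__":
    for dn in stale_entries(SERVER1_DUMP, SERVER2_DUMP):
        print("not in sync:", dn)
```

An empty result would mean both dumps agree on every entryCSN; any DN listed is an entry whose change did not reach the other server (or is missing there entirely).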
Peter Mogensen wrote:
Quanah Gibson-Mount wrote:
By "writes" that would include anything changed by syncrepl. I think Howard's point is that server2 says it has received writes aka changes from server1. So the question is, what makes you think that server2 hasn't gotten said writes when it is reporting that it is?
As I said, server2 does get the updated contextCSN from server1, but only the contextCSN. The data changes I make are not propagated.
I was about to give up, but I tcpdumped the communication to confirm that the data was indeed not sent, and switched on all loglevel options at server1. Still, the only interesting thing I could see was the "changed by peer, ignored" message.
Since it was not ITS#6367 (I use 2.4.20) I had to do some googling. I found this thread: http://www.openldap.org/lists/openldap-software/200810/msg00116.html
I realized that I had actually once noticed that server2 had gotten a wrong ServerID, but I figured I had made a typo.
However, it seems that the example in the documentation regarding mirrormode setup is only valid if you DO NOT mirror cn=config.
If you also mirror cn=config, you'll have to provide both ServerIDs on both servers:
ServerID 1 ldaps://server1.example.com/
ServerID 2 ldaps://server2.example.com/
... and then it works.
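The point of the URL form is that the same list of ServerID lines can live in a replicated cn=config, with each slapd picking the ID whose URL matches one of its own listener (-h) URLs. The following is only a toy model of that selection, not slapd's actual code: the function names are invented, and treating a missing port as the scheme default (389 for ldap, 636 for ldaps) is an assumption of this sketch.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not name one explicitly.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def normalize(url):
    """Reduce an LDAP URL to a comparable (scheme, host, port) triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def pick_server_id(server_ids, listeners):
    """Return the first ID whose URL matches a listener URL, else None."""
    mine = {normalize(u) for u in listeners}
    for sid, url in server_ids:
        if normalize(url) in mine:
            return sid
    return None


# The shared ServerID list from this message, identical on both servers.
SERVER_IDS = [
    (1, "ldaps://server1.example.com/"),
    (2, "ldaps://server2.example.com/"),
]

# Listener URLs as passed to slapd -h earlier in the thread; the server2
# list is the assumed equivalent with the hostname swapped.
SERVER1_LISTENERS = ["ldapi:///", "ldaps://server1.example.com:636/",
                     "ldap://server1.example.com/"]
SERVER2_LISTENERS = ["ldapi:///", "ldaps://server2.example.com:636/",
                     "ldap://server2.example.com/"]

if __name__ == "__main__":
    print("server1 uses SID", pick_server_id(SERVER_IDS, SERVER1_LISTENERS))
    print("server2 uses SID", pick_server_id(SERVER_IDS, SERVER2_LISTENERS))
```

With a bare "ServerID 1" in a mirrored cn=config there is nothing for the second server to match against, which is consistent with server2 ending up with an unexpected ServerID.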
I might have been able to figure this out earlier, but may I suggest that this is mentioned in the next revision of the mirrormode docs?
Thanks for your time, /Peter
--On Saturday, December 05, 2009 6:21 PM +0100 Peter Mogensen apm@mutex.dk wrote:
I might have been able to figure this out earlier, but may I suggest that this is mentioned in the next revision of the mirrormode docs?
If you find a bug in the documentation, then please file an ITS at http://www.openldap.org/its/ so it can be tracked and fixed. Give enough detail to fully describe the issue you have found.
Thanks, Quanah