Hi,
I've set up a multimaster cluster composed of two machines (in my example 192.168.0.204 and 192.168.0.197). Everything is working and both sides are replicating correctly.
However, I've a problem I'd like to submit to your sagacity.
When I take one server down and modify entries on the other (delete or add), the modifications are not pushed to the first server when it comes back up. Server 1 logs: Entry cn=seb,ou=orgunit,o=org,dc=example,dc=com changed by peer, ignored
Adding new entries works and they are synchronised, but for the entries altered while one of the servers was down, the modifications are lost (or, more precisely, ignored by the other server).
My questions: Is this normal behaviour (maybe I got the configuration wrong)? How can I force the missing entries to be replicated to the other server? (The only solution I found is to wipe the entire database on the server that was down, which forces a full replication from its peer.)
Sincerely, Seb
Here is an extract of the slapd.log from server 1:
Sep 22 10:12:32 dhcp204 slapd[2689]: do_syncrep2: rid=002 (-1) Can't contact LDAP server
Sep 22 10:12:32 dhcp204 slapd[2689]: do_syncrepl: rid=002 rc -1 retrying (4 retries left)
Sep 22 10:12:32 dhcp204 slapd[2689]: do_syncrep2: rid=004 (-1) Can't contact LDAP server
Sep 22 10:12:32 dhcp204 slapd[2689]: do_syncrepl: rid=004 rc -1 retrying (4 retries left)
Sep 22 10:12:32 dhcp204 slapd[2689]: conn=1002 op=2 UNBIND
Sep 22 10:12:32 dhcp204 slapd[2689]: conn=1002 fd=19 closed
Sep 22 10:12:32 dhcp204 slapd[2689]: conn=1003 op=2 UNBIND
Sep 22 10:12:32 dhcp204 slapd[2689]: conn=1003 fd=20 closed
Sep 22 10:12:37 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=config" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:37 dhcp204 slapd[2689]: do_syncrepl: rid=002 rc -1 retrying (3 retries left)
Sep 22 10:12:37 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=manager,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:37 dhcp204 slapd[2689]: do_syncrepl: rid=004 rc -1 retrying (3 retries left)
Sep 22 10:12:42 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=manager,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:42 dhcp204 slapd[2689]: do_syncrepl: rid=004 rc -1 retrying (2 retries left)
Sep 22 10:12:42 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=config" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:42 dhcp204 slapd[2689]: do_syncrepl: rid=002 rc -1 retrying (2 retries left)
Sep 22 10:12:47 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=manager,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:47 dhcp204 slapd[2689]: do_syncrepl: rid=004 rc -1 retrying (1 retries left)
Sep 22 10:12:47 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=config" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:47 dhcp204 slapd[2689]: do_syncrepl: rid=002 rc -1 retrying (1 retries left)
Sep 22 10:12:52 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=config" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:52 dhcp204 slapd[2689]: do_syncrepl: rid=002 rc -1 retrying
Sep 22 10:12:52 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=manager,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:52 dhcp204 slapd[2689]: do_syncrepl: rid=004 rc -1 retrying
Sep 22 10:12:57 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=config" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:57 dhcp204 slapd[2689]: do_syncrepl: rid=002 rc -1 retrying (4 retries left)
Sep 22 10:12:57 dhcp204 slapd[2689]: slap_client_connect: URI=ldap://192.168.0.197 DN="cn=manager,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
Sep 22 10:12:57 dhcp204 slapd[2689]: do_syncrepl: rid=004 rc -1 retrying (4 retries left)
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 fd=13 ACCEPT from IP=192.168.0.197:55471 (IP=0.0.0.0:389)
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 op=0 BIND dn="cn=manager,dc=example,dc=com" method=128
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 op=0 BIND dn="cn=manager,dc=example,dc=com" mech=SIMPLE ssf=0
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 op=0 RESULT tag=97 err=0 text=
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 fd=15 ACCEPT from IP=192.168.0.197:55473 (IP=0.0.0.0:389)
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 op=0 BIND dn="cn=config" method=128
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 op=0 BIND dn="cn=config" mech=SIMPLE ssf=0
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 op=0 RESULT tag=97 err=0 text=
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 op=1 SRCH base="cn=config" scope=2 deref=0 filter="(objectClass=*)"
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 op=1 SRCH attr=* +
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1008 op=1 INTERM oid=1.3.6.1.4.1.4203.1.9.1.4
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 op=1 SRCH attr=* +
Sep 22 10:13:02 dhcp204 slapd[2689]: srs csn 20110922081225.199039Z#000000#000#000000
Sep 22 10:13:02 dhcp204 slapd[2689]: log csn 20110922081225.199039Z#000000#000#000000
Sep 22 10:13:02 dhcp204 slapd[2689]: cmp 0, too old
Sep 22 10:13:02 dhcp204 slapd[2689]: Entry cn=seb,ou=orgunit,o=org,dc=example,dc=com changed by peer, ignored
Sep 22 10:13:02 dhcp204 slapd[2689]: syncprov_search_response: cookie=rid=003,csn=20110922081235.611410Z#000000#000#000000
Sep 22 10:13:02 dhcp204 slapd[2689]: conn=1007 op=1 INTERM oid=1.3.6.1.4.1.4203.1.9.1.4
Sep 22 10:17:57 dhcp204 slapd[2689]: do_syncrep2: rid=002 LDAP_RES_INTERMEDIATE - REFRESH_DELETE
Sep 22 10:17:57 dhcp204 slapd[2689]: do_syncrep2: rid=004 LDAP_RES_INTERMEDIATE - REFRESH_DELET
Sébastien Bernard wrote:
When I take one server down and modify entries on the other (delete or add), the modifications are not pushed to the first server when it comes back up. Server 1 logs: Entry cn=seb,ou=orgunit,o=org,dc=example,dc=com changed by peer, ignored
You have not provided enough useful information (OpenLDAP version, exact server configurations, which one is "server 1" in your description) to be certain. But most likely you have not configured their ServerIDs correctly.
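One way to check the ServerID hypothesis against the log extract earlier in this thread is to look at the CSNs it contains. The sketch below assumes the usual OpenLDAP 2.4 CSN layout of timestamp, change count, server ID (three hex digits) and modifier, separated by `#`; the names `CSN_RE` and `server_ids_in_log` are hypothetical helpers, not part of any OpenLDAP tool.

```python
import re

# Assumed CSN layout: <timestamp>#<change count>#<server ID, hex>#<modifier>
CSN_RE = re.compile(
    r"(\d{14}\.\d{6}Z)#([0-9a-fA-F]{6})#([0-9a-fA-F]{3})#([0-9a-fA-F]{6})"
)


def server_ids_in_log(log_text):
    """Return the set of server IDs (as integers) found in CSNs in log_text."""
    return {int(m.group(3), 16) for m in CSN_RE.finditer(log_text)}


# Two lines copied from the log extract posted in this thread.
sample = """\
Sep 22 10:13:02 dhcp204 slapd[2689]: srs csn 20110922081225.199039Z#000000#000#000000
Sep 22 10:13:02 dhcp204 slapd[2689]: syncprov_search_response: cookie=rid=003,csn=20110922081235.611410Z#000000#000#000000
"""

print(sorted(server_ids_in_log(sample)))  # → [0]
```

Every CSN in the posted extract carries server ID 000, which is the default value. That is at least consistent with the servers not having picked up distinct ServerIDs, although the extract alone does not prove it.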
On 24/09/2011 01:43, Howard Chu wrote:
You have not provided enough useful information (OpenLDAP version, exact server configurations, which one is "server 1" in your description) to be certain. But most likely you have not configured their ServerIDs correctly.
OpenLDAP is 2.4.26-2 from Fedora. I'll include the cn=config tree I have. cn=config is replicated between both servers (as described in chapter 18 of the admin guide). Server 1 is 192.168.0.204. Server 2 is 192.168.0.197. I used server 1 to import all the entries.
The configuration is included as an attachment. All I can say is that I tried to follow the instructions in the guide.
I do not pretend to fully understand what I was doing, but I managed to get replication working both ways.
A few points remain unclear, such as the numbering of the rids for replication: are rids per branch or global to the slapd instance? Should they be assigned incrementally?
Next, the ServerIDs are only declared in the cn=config node. Shouldn't they also be declared in the dc=aaa,dc=fr branch?
Sincerely
Seb
On 24/09/2011 01:43, Howard Chu wrote:
You have not provided enough useful information (OpenLDAP version, exact server configurations, which one is "server 1" in your description) to be certain. But most likely you have not configured their ServerIDs correctly.
OK, now that I've understood my mistake, everything works: replication started behaving correctly as soon as I modified the command line of each server so that each one knows which server it is.
The warning is in the admin guide but is not visible enough, which is why I'm proposing the small modification attached to this mail.
1- It explicitly states that each server must be started with one of the URLs.
2- It makes clear that replicating cn=config is not mandatory. The current text is confusing at best: the way I understood it, you must replicate cn=config for N-way multimaster to work, whereas it is just a convenient way to deploy the replication configuration on each server.
The result is still not crystal clear. IMHO, the best approach would be to split the two options into two separate examples and to state explicitly the benefits of also replicating cn=config.
Seb
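The fix described above can be pictured with a small model. This is only a simplified sketch of the idea, not slapd's actual code: it assumes each server picks its ID by comparing the URLs in the ServerID declarations with the listener URLs it was started with (slapd's -h option), and falls back to the default ID 0 when none matches. The function name `resolve_server_id`, the ID values 1 and 2, and the URL normalisation are assumptions made for illustration; the thread does not give the IDs actually used.

```python
def resolve_server_id(server_ids, listener_urls):
    """Pick the ID whose URL matches one of the listener URLs.

    server_ids: list of (sid, url) pairs, as in the ServerID declarations.
    listener_urls: URLs the server was started with (slapd's -h option).
    Returns the matching sid, or 0 (the default) when nothing matches.
    """
    # Assumed normalisation: ignore case and trailing slashes.
    listeners = {u.rstrip("/").lower() for u in listener_urls}
    for sid, url in server_ids:
        if url.rstrip("/").lower() in listeners:
            return sid
    return 0


# Illustrative IDs; the same list is shared by both servers via cn=config.
declared = [(1, "ldap://192.168.0.204"), (2, "ldap://192.168.0.197")]

# Started with an explicit URL, the server can tell which entry is its own.
print(resolve_server_id(declared, ["ldap://192.168.0.204"]))  # → 1

# Started with a generic listener, nothing matches and the ID stays at 0.
print(resolve_server_id(declared, ["ldap:///"]))  # → 0
```

Under this model, two servers started with a generic listener both end up with ID 0 and stamp their changes identically, which would explain why a peer's changes looked too old and were ignored.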
--On Monday, October 17, 2011 4:38 PM +0200 Sébastien Bernard seb@frankengul.org wrote:
The warning is in the admin guide but not visible enough. This is the reason why I'm proposing the little modification attached to my mail.
Please file an ITS about this at http://www.openldap.org/its/ so the documentation updates can be appropriately tracked. Thanks!
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
openldap-technical@openldap.org