All,
I have two identical servers (RHEL-based VMs, server1 and server3) running OpenLDAP 2.4.23,
built with these options:
--with-tls --prefix=/etc/operator/openldap --enable-syncprov --enable-syslog --enable-crypt -
I have the strangest problem, and am desperate for any insight you might provide.
If I make a change on server3, everything is fine and the change is replicated to server1. If I make a change on server1, server1 changes, but no changes are seen on server3.
Looking at the logs on server1, and using tcpdump to sniff the connection, I can see that when a change is made on server1 it doesn't even attempt to contact server3.
As far as I can tell the configs are identical, and I have no clue what's causing this. Any hint at all would be gratefully accepted. Configs from both machines are attached: server1 and server3 (output of ldapsearch on cn=config). Also attached are logs (olcLogLevel is Sync) of what happens when I change a value (olcLogLevel) on each of the two servers (change-on-server1 and change-on-server3).
Thanks in advance for any help you can give. Alister
-- Alister Forbes TACSUNS _.|._.|._ Cisco Systems
Please avoid sending me Word or PowerPoint attachments. See - http://www.gnu.org/philosophy/no-word-attachments.html
Hello Alister,
On 23/09/2010 12:04, Alister Forbes wrote:
[...]
I note several things:
The retry value of your syncrepl statements is set so that only a limited number of retries will occur. It is possible that (during some downtime) slapd has used up all these retries, and given up on a particular syncrepl consumer. A restart of slapd should solve this.
Looking at the logs, it seems that server3 at least is confused as to who is who, since it is sending out the change to both server1 and itself (and then dismissing it with "CSN too old, ignoring").
However, since one of your changes is to change the log level to "stats", therefore excluding "sync", it's unclear how trustworthy these logs are...
I suggest starting over: restart both instances of slapd with -c rid=001 -c rid=003, to reset the replication status, and take it from there.
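To make the point about a limited retry budget concrete, here is a minimal Python sketch. It assumes the documented syncrepl retry format (pairs of "<interval in seconds> <number of retries>", with "+" meaning retry indefinitely). The function name retry_budget and the sample values are illustrative only; the actual retry values from the attached configs are not shown in this thread.

```python
from typing import Optional, Tuple


def retry_budget(retry: str) -> Optional[Tuple[int, int]]:
    """Return (attempts, seconds) after which the consumer stops retrying.

    `retry` is the value of the syncrepl retry parameter: pairs of
    "<interval in seconds> <number of retries>". A "+" in place of the
    count means retry indefinitely, in which case None is returned.
    """
    fields = retry.split()
    if not fields or len(fields) % 2 != 0:
        raise ValueError("retry must be pairs of '<interval> <count>'")
    attempts = 0
    seconds = 0
    for interval, count in zip(fields[0::2], fields[1::2]):
        if count == "+":
            return None  # the consumer never gives up
        attempts += int(count)
        seconds += int(interval) * int(count)
    return attempts, seconds


# Illustrative values, not taken from the attached configs
print(retry_budget("60 10 300 3"))  # (13, 1500)
print(retry_budget("5 5 300 +"))    # None
```

With a finite budget such as the first example, an outage longer than 25 minutes would exhaust the retries, which is the situation where a restart of slapd is needed.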
Hope this helps, Jonathan
Hi Jonathan,
On 23 Sep 2010, at 15:24, Jonathan CLARKE wrote:
[...]
Thanks very much for this. I should have been clearer in my original mail: although I did make changes to olcLogLevel in the ldapmodify commands, olcLogLevel was always set to Sync at the beginning of each command.
I did restart with the -c options, but I'm still seeing exactly the same behaviour.
Looking at my configs again, I still see only one ContextCSN on server3, and two on server1.
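One way to read that difference is to compare the server IDs embedded in each contextCSN. The sketch below assumes the OpenLDAP 2.4 CSN layout (timestamp, change count, server ID in hex, modification count, separated by "#"). The CSN values and the IDs 1 and 3 are made up for illustration; they are not taken from the attached configs.

```python
def sids(context_csns):
    """Return the set of server IDs (as ints) found in contextCSN values."""
    result = set()
    for csn in context_csns:
        # CSN layout: <timestamp>#<change count>#<server ID, hex>#<mod count>
        _timestamp, _count, sid, _mod = csn.split("#")
        result.add(int(sid, 16))
    return result


# Hypothetical values shaped like the situation described above:
# two contextCSN values on server1, only one on server3.
server1 = [
    "20100923100412.000000Z#000000#001#000000",
    "20100923100530.000000Z#000000#003#000000",
]
server3 = [
    "20100923100530.000000Z#000000#003#000000",
]

missing_on_server3 = sids(server1) - sids(server3)
print(sorted(missing_on_server3))  # [1]
```

If the real values look like this, server3 has never recorded a change carrying server1's ID, which would match the one-way replication described earlier.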
Any suggestions? Alister
========================================== Jonathan CLARKE
Normation 44 rue Cauchy, 94110 Arcueil, France
Telephone: +33 (0)1 83 62 26 96
Web: http://www.normation.com/
Hi,
On 24/09/2010 07:31, Alister Forbes wrote:
[...]
In this case, I suspect something is wrong with your DNS/IP setup. slapd identifies itself against the values in olcServerID by checking the host's FQDN (see the output of hostname --fqdn) and the hostnames in the -h option passed on startup.
Make sure your /etc/hosts contains sensible values for all names and IPs involved, and that you're running both slapds with something like -h ldap://serverN.example.com/.
If this still fails, maybe post a log excerpt from slapd startup with log levels config and sync?
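As a rough illustration of the matching described above, here is a toy Python model. It is not slapd's actual code: it only encodes the idea that a server picks the olcServerID whose URL names a host that also appears in its -h listener URLs or equals its FQDN. The function names and the ID/URL pairs are hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Lower-cased host part of an LDAP URL ("" if the URL has none)."""
    return (urlsplit(url).hostname or "").lower()


def pick_server_id(server_ids, listener_urls, fqdn=None):
    """Return the ID whose URL host matches a known local name, else None.

    server_ids: list of (id, url) pairs, as in olcServerID values.
    listener_urls: the URLs passed to slapd -h.
    fqdn: the host's fully qualified name, if known.
    """
    names = {host_of(u) for u in listener_urls}
    if fqdn:
        names.add(fqdn.lower())
    names.discard("")  # "ldap:///" carries no host name
    for sid, url in server_ids:
        if host_of(url) in names:
            return sid
    return None


# Hypothetical olcServerID values for the two masters
SERVER_IDS = [
    (1, "ldap://server1.example.com"),
    (3, "ldap://server3.example.com"),
]

print(pick_server_id(SERVER_IDS, ["ldap://server3.example.com/"]))  # 3
print(pick_server_id(SERVER_IDS, ["ldap:///"]))                     # None
print(pick_server_id(SERVER_IDS, ["ldap:///"], fqdn="server1.example.com"))  # 1
```

In this model, a server started without a matching -h URL and with an FQDN that does not appear in olcServerID finds no ID of its own, which is the kind of identity confusion suspected here.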
Jonathan
Hi Jonathan,
On 24 Sep 2010, at 08:38, Jonathan CLARKE wrote:
[...]
I think you may be on to something here. If I launch server3 with slapd -h ldap://server3.example.com then I can't connect to it with ldapsearch any more, but server1 does sync with it.
Now I just have to work out the magic incantation that lets both me and another master talk to an LDAP server at the same time.
from server1:
netstat -an | grep 389
tcp  0  0  0.0.0.0:389           0.0.0.0:*             LISTEN
tcp  0  0  127.0.0.1:389         127.0.0.1:62264       ESTABLISHED
tcp  0  0  127.0.0.1:62264       127.0.0.1:389         ESTABLISHED
tcp  0  0  <SNIP>.181.39:389     <SNIP>.181.2:33860    ESTABLISHED
tcp  0  0  :::389                :::*                  LISTEN
from server3:
# netstat -an | grep 389
tcp  0  0  127.0.0.1:389         0.0.0.0:*             LISTEN
tcp  0  0  <SNIP>.181.2:33860    <SNIP>.181.39:389     ESTABLISHED
tcp  0  0  ::1:389               :::*                  LISTEN
This looks like it's not listening for any new connections once the connection to server1 is set up. Does that sound right?
Alister
On 24/09/2010 10:24, Alister Forbes wrote:
[...]
No. If you run with -h ldap://server3.example.com, then slapd listens *only* on that interface. You would have to specify -H ldap://server3.example.com to ldapsearch for it to connect to it.
A simple solution is to get slapd listening on both interfaces: -h "ldap://server3.example.com ldap://localhost".
Jonathan
On 24 Sep 2010, at 10:36, Jonathan CLARKE wrote:
[...]
But if I run it with no -h option it does listen on all interfaces (at least according to the man page). Here's what I used to start server3:
# slapd -h ldap://server3.example.com -c rid=001 -c rid=003
# netstat -an | grep 389
tcp  0  0  127.0.0.1:389         0.0.0.0:*             LISTEN
tcp  0  0  <SNIP>.181.2:47363    <SNIP>.181.39:389     ESTABLISHED
tcp  0  0  <SNIP>.181.2:33860    <SNIP>.181.39:389     TIME_WAIT
tcp  0  0  ::1:389               :::*                  LISTEN
and here, the output from my local machine:
$ ldapsearch -x -h server3-b 'cn=config' -D 'cn=admin,cn=config' -wpass123 -s base olcLogLevel
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
$ ldapsearch -x -H ldap://server3.example.com -b 'cn=config' -D 'cn=admin,cn=config' -wpass123 -s base olcLogLevel
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
My concern is that, as the connection is established with <SNIP>.181.3 (server1), there doesn't seem to be another listener waiting for new connections. (localhost is listening, but that doesn't help if I want to connect from a remote client, does it?)
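The same check can be done mechanically. The following Python sketch pulls the LISTEN lines for port 389 out of netstat -an output and reports whether any of them is on a non-loopback address. The function names are illustrative; the sample text is the server3 output shown above.

```python
import ipaddress


def listening_addresses(netstat_output, port=389):
    """Return the local addresses with a LISTEN socket on the given port."""
    addrs = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[-1] != "LISTEN":
            continue  # skip prompts, ESTABLISHED, TIME_WAIT, ...
        # Local address is the 4th column, e.g. "127.0.0.1:389" or "::1:389"
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addrs.append(ipaddress.ip_address(host))
    return addrs


def reachable_remotely(addrs):
    """True if at least one listener is not bound to a loopback address."""
    return any(not a.is_loopback for a in addrs)


SERVER3 = """\
tcp 0 0 127.0.0.1:389 0.0.0.0:* LISTEN
tcp 0 0 <SNIP>.181.2:47363 <SNIP>.181.39:389 ESTABLISHED
tcp 0 0 <SNIP>.181.2:33860 <SNIP>.181.39:389 TIME_WAIT
tcp 0 0 ::1:389 :::* LISTEN
"""

addrs = listening_addresses(SERVER3)
print([str(a) for a in addrs])    # ['127.0.0.1', '::1']
print(reachable_remotely(addrs))  # False
```

For server3 only the two loopback listeners show up, so the established connection to server1 is an outgoing one and nothing on server3 is accepting connections from other hosts.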
Alister
openldap-technical@openldap.org