Below is a sample configuration that reproduces the problem:
Three OpenLDAP data instances configured as follows:

include /opt/openldap/etc/openldap/schema/core.schema
include /opt/openldap/etc/openldap/schema/cosine.schema
include /opt/openldap/etc/openldap/schema/inetorgperson.schema
include /opt/openldap/etc/openldap/schema/nis.schema
include /opt/openldap/etc/openldap/schema/dyngroup.schema
include /opt/openldap/etc/openldap/schema/misc.schema

pidfile /opt/openldap/var/run/server1.pid
argsfile /opt/openldap/var/run/server1.args
loglevel stats

database bdb
suffix ou=orgunit,o=gouv,c=fr
directory /opt/openldap/var/server1
Note: for the other instances, server1 is replaced by server2 and server3.
Each instance contains the following data (only 4 entries):

dn: ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: organizationalUnit
ou: orgunit

dn: ou=dept1,ou=orgunit,o=gouv,c=fr
ou: dept1
objectClass: top
objectClass: organizationalUnit

dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: user11@server1.com
cn: User 11
uid: user11
givenName: User
sn: 11

dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: user12@server1.com
cn: User 12
uid: user12
givenName: User
sn: 12
Note: in instances 2 and 3, user1x and dept1 are replaced by user2x/dept2 and user3x/dept3.
The data instances are launched with the following commands:

/opt/openldap/libexec/slapd -n server1 -f /opt/openldap/etc/openldap/server1.conf -h ldap://0.0.0.0:1001/
/opt/openldap/libexec/slapd -n server2 -f /opt/openldap/etc/openldap/server2.conf -h ldap://0.0.0.0:1002/
/opt/openldap/libexec/slapd -n server3 -f /opt/openldap/etc/openldap/server3.conf -h ldap://0.0.0.0:1003/
The meta instance is configured as follows:

include /opt/openldap/etc/openldap/schema/core.schema
include /opt/openldap/etc/openldap/schema/cosine.schema
include /opt/openldap/etc/openldap/schema/inetorgperson.schema
include /opt/openldap/etc/openldap/schema/nis.schema
include /opt/openldap/etc/openldap/schema/dyngroup.schema
include /opt/openldap/etc/openldap/schema/anais.schema
include /opt/openldap/etc/openldap/schema/misc.schema

pidfile /opt/openldap/var/run/meta.pid
argsfile /opt/openldap/var/run/meta.args

database meta
suffix ou=orgunit,o=gouv,c=fr
uri ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 3
uri ldap://localhost:1002/ou=dept2,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 4
uri ldap://localhost:1003/ou=dept3,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 4
It is launched as follows:

/opt/openldap/libexec/slapd -n meta -f /opt/openldap/etc/openldap/meta.conf -h ldap://0.0.0.0:1000/ -d 256
# test with the 3 servers up
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
=> entries from all three servers are returned
# stop server2 (kill -INT ...) and perform a new search:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> looks good: entries from server1 and server3 are returned
Below are the meta instance logs:

conn=1001 fd=9 ACCEPT from IP=172.30.8.13:55048 (IP=0.0.0.0:1000)
conn=1001 op=0 BIND dn="" method=128
conn=1001 op=0 RESULT tag=97 err=0 text=
conn=1001 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 filter="(objectClass=person)"
conn=1001 op=1 SRCH attr=dn
conn=1001 op=1 meta_back_retry[1]: retrying URI="ldap://localhost:1002" DN="".
conn=1001 op=1 meta_back_retry[1]: meta_back_single_dobind=52
conn=1001 op=1 SEARCH RESULT tag=101 err=0 nentries=4 text=
conn=1001 op=2 UNBIND
conn=1001 fd=9 closed

=> looks good, as nentries=4
# perform numerous new searches without changing anything:
[root@pp-ae2-proxy2 log]# /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
=> nothing returned
Below are the corresponding logs:

conn=1002 fd=9 ACCEPT from IP=172.30.8.13:55049 (IP=0.0.0.0:1000)
conn=1002 op=0 BIND dn="" method=128
conn=1002 op=0 RESULT tag=97 err=0 text=
conn=1002 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 filter="(objectClass=person)"
conn=1002 op=1 SRCH attr=dn
conn=1002 op=1 meta_search_dobind_init[1]: retrying URI="ldap://localhost:1002" DN="".
conn=1002 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
conn=1002 op=2 UNBIND
conn=1002 fd=9 closed

=> looks bad, as nentries=0
=> only the first search after server2 is stopped is successful.
# new search, but using the server1 ou:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=dept1,ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
=> looks good
# same search as earlier, i.e. using the root node:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
=> looks good as well. It seems everything works again, but only once a connection to server1 has been opened by some other means.
# new search, but using the server3 base object:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=dept3,ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> looks good
# new search, but using the slapd-meta base object:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> entries from server1 and server3 are returned
=> this confirms that lookups on server1 and server3 are not performed until a connection has been opened to each of them using its respective base object
=> another strange behavior: if the search using the server3 ou is performed before the search using the server1 ou, then the next search on the root node retrieves entries from both server1 and server3 ...
# new search after server2 restart:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> good, all entries are returned
# new search after restarting the meta instance while server2 is already stopped:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn | grep dn:
=> unlike the previous test case (server2 stopped while the meta instance was already running), we do not even see the single successful search.
After that, the behaviour is the same, i.e. the search on the root node works again, but only once searches have been performed using ou=dept1 and ou=dept3.
In addition, the behaviour is slightly different when the "conn-ttl" parameter is set to 3 seconds. I could describe it in a new post.
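For reference, the conn-ttl experiment mentioned above would look something like this in the meta database section (a sketch only, not the exact configuration used; the directive is documented in slapd-meta(5)):

```conf
database meta
suffix   ou=orgunit,o=gouv,c=fr
# drop cached connections after 3 seconds, so the next operation
# re-creates them (the value used in the test mentioned above)
conn-ttl 3
uri ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
uri ldap://localhost:1002/ou=dept2,ou=orgunit,o=gouv,c=fr
uri ldap://localhost:1003/ou=dept3,ou=orgunit,o=gouv,c=fr
```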
Thanks to anyone who could help identify whether this is a misconfiguration or a bug.
Michel Gruau
Message of 19/08/11 13:13
From: "Michel Gruau"
To: "openldap-technical openldap org"
Cc:
Subject: Slapd-meta stops at the first unreachable candidate
Hello,
I have a slapd-meta configuration as follows:
database meta
suffix dc=com
uri ldap://server1:389/dc=suffix1,dc=com
uri ldap://server2:389/dc=suffix2,dc=com
uri ldap://server3:389/dc=suffix3,dc=com
I performed numerous tests using the base "dc=com" and changing the order of the above list of uri directives (in slapd.conf), and I see that as soon as a candidate directory is unreachable, none of the directories listed after the failing one is queried by the proxy. For instance, in the example above:
- if server2 is down, then server3 is not requested
- if server1 is down, then none of the directories is requested.
I have the feeling this is a bug ... could you confirm?
FYI, I also tried the "onerr continue" config option, but it did not change anything.
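For completeness, the onerr test was done along these lines (a sketch; the onerr directive and its "continue" value are from slapd-meta(5), the rest mirrors the configuration above):

```conf
database meta
suffix   dc=com
# continue with the remaining targets when one target fails,
# instead of aborting the whole search
onerr    continue
uri ldap://server1:389/dc=suffix1,dc=com
uri ldap://server2:389/dc=suffix2,dc=com
uri ldap://server3:389/dc=suffix3,dc=com
```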
Thanks in advance.
Michel