Hello list.
I recently faced a strange issue while upgrading from openldap 2.3 to 2.4 (from centos 5.7 to 6.2, actually): the change was transparent for every applications excepted Zimbra, for which any authentication attempt was suffering from an unexplained 30s additional delay. Just switching from explicit TLS usage on port 389 to explicit SSL usage on port 636 was enough to fix the issue.
The logs shows than the delay occurs between the moment where the bind operation succeed, and the moment the client connection get closed:
Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 fd=135 ACCEPT from IP=128.93.142.13:41191 (IP=0.0.0.0:389) Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 op=0 EXT oid=1.3.6.1.4.1.1466.20037 Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 op=0 STARTTLS Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 op=0 RESULT oid= err=0 text= Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 fd=135 TLS established tls_ssf=256 ssf=256 Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 op=1 BIND dn="uid=fauge00C,ou=people,dc=inria,dc=fr" method=128 Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 op=1 BIND dn="uid=fauge00C,ou=people,dc=inria,dc=fr" mech=SIMPLE ssf=0 Jun 14 11:56:04 ildapslave2 slapd[16618]: conn=2787 op=1 RESULT tag=97 err=0 text= ... Jun 14 11:56:34 ildapslave2 slapd[16618]: conn=2787 fd=135 closed (connection lost)
Before the upgrade, the connection get closed immediatly, and there is no such delay.
Using higher logging level doesn't provide additional useful details, excepted maybe more details about connection termination: Jun 14 12:53:21 ildapslave2 slapd[7156]: connection_read(109): checking for input on id=1135 Jun 14 12:53:21 ildapslave2 slapd[7156]: ber_get_next on fd 109 failed errno=0 (Success) Jun 14 12:53:21 ildapslave2 slapd[7156]: connection_read(109): input error=-2 id=1135, closing.
I'm aware than this behaviour change may actually come from underlying libraries, such as bdb for instance, rather than openldap itself, but that's still quite a curious issue. Does anyone have a clue about this problem ?
--On Monday, June 25, 2012 1:46 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Hello list.
I recently faced a strange issue while upgrading from openldap 2.3 to 2.4 (from centos 5.7 to 6.2, actually): the change was transparent for every applications excepted Zimbra, for which any authentication attempt was suffering from an unexplained 30s additional delay. Just switching from explicit TLS usage on port 389 to explicit SSL usage on port 636 was enough to fix the issue.
I would use ldapsearch -d -1 to see what function it was hanging in.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Le 25/06/2012 20:06, Quanah Gibson-Mount a écrit :
--On Monday, June 25, 2012 1:46 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Hello list.
I recently faced a strange issue while upgrading from openldap 2.3 to 2.4 (from centos 5.7 to 6.2, actually): the change was transparent for every applications excepted Zimbra, for which any authentication attempt was suffering from an unexplained 30s additional delay. Just switching from explicit TLS usage on port 389 to explicit SSL usage on port 636 was enough to fix the issue.
I would use ldapsearch -d -1 to see what function it was hanging in.
Unfortunatly (sort of), I can't reproduce the issue with any other client as zimbra itself... ldapsearch works fine, even when run from the same host as zimbra.
--On Tuesday, June 26, 2012 10:27 AM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Le 25/06/2012 20:06, Quanah Gibson-Mount a écrit :
--On Monday, June 25, 2012 1:46 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Hello list.
I recently faced a strange issue while upgrading from openldap 2.3 to 2.4 (from centos 5.7 to 6.2, actually): the change was transparent for every applications excepted Zimbra, for which any authentication attempt was suffering from an unexplained 30s additional delay. Just switching from explicit TLS usage on port 389 to explicit SSL usage on port 636 was enough to fix the issue.
I would use ldapsearch -d -1 to see what function it was hanging in.
Unfortunatly (sort of), I can't reproduce the issue with any other client as zimbra itself... ldapsearch works fine, even when run from the same host as zimbra.
ldapsearch against the zimbra server works fine (no delay on closing)? Then that would imply the issue is with whatever client is making the connection to Zimbra. It looks like slapd is simply detecting the connection was never closed properly after about 30 seconds, and taking care of closing it. I don't specifically see any issue here. Initially I thought you were saying you were seeing this issue while initiating a connection, not while closing it.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Le 26/06/2012 21:25, Quanah Gibson-Mount a écrit :
--On Tuesday, June 26, 2012 10:27 AM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Le 25/06/2012 20:06, Quanah Gibson-Mount a écrit :
--On Monday, June 25, 2012 1:46 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Hello list.
I recently faced a strange issue while upgrading from openldap 2.3 to 2.4 (from centos 5.7 to 6.2, actually): the change was transparent for every applications excepted Zimbra, for which any authentication attempt was suffering from an unexplained 30s additional delay. Just switching from explicit TLS usage on port 389 to explicit SSL usage on port 636 was enough to fix the issue.
I would use ldapsearch -d -1 to see what function it was hanging in.
Unfortunatly (sort of), I can't reproduce the issue with any other client as zimbra itself... ldapsearch works fine, even when run from the same host as zimbra.
ldapsearch against the zimbra server works fine (no delay on closing)? Then that would imply the issue is with whatever client is making the connection to Zimbra. It looks like slapd is simply detecting the connection was never closed properly after about 30 seconds, and taking care of closing it. I don't specifically see any issue here. Initially I thought you were saying you were seeing this issue while initiating a connection, not while closing it.
Sorry, I'm not a Zimbra admin, and I may have been confusing in my explanations. The problem occurs with Zimbra acting as an LDAP client against an external LDAP server, performing a bind operation for authenticating users, with the following behaviour:
Zimbra against on openldap 2.3.x server, with TLS on port 389: OK Zimbra against on openldap 2.4.x server, on port 636: OK Zimbra against on openldap 2.4.x server, with TLS on port 389: 30s delay
The mail admin is supposed to report a bug to Zimbra support.
--On Wednesday, June 27, 2012 3:31 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Sorry, I'm not a Zimbra admin, and I may have been confusing in my explanations. The problem occurs with Zimbra acting as an LDAP client against an external LDAP server, performing a bind operation for authenticating users, with the following behaviour:
Zimbra against on openldap 2.3.x server, with TLS on port 389: OK Zimbra against on openldap 2.4.x server, on port 636: OK Zimbra against on openldap 2.4.x server, with TLS on port 389: 30s delay
Ok, so what you are saying is:
You upgraded your OS to CentOS6
You use external auth
The external auth from CentOS6 to your own LDAP server shows a 30 second delay on closing.
That sounds like a bug in Java/JNDI. I did see some 30 second issues with RHEL6, but it was with initiating a connection, not closing it. You can see more about that at https://stomp.colorado.edu/blog/blog/2011/06/29/on-rhel-6-ssh-dns-firewalls-and-slow-logins/
I would note that JNDI behavior varies based on startTLS vs SSL on port 636 as well.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Le 28/06/2012 01:41, Quanah Gibson-Mount a écrit :
--On Wednesday, June 27, 2012 3:31 PM +0200 Guillaume Rousse guillomovitch@gmail.com wrote:
Sorry, I'm not a Zimbra admin, and I may have been confusing in my explanations. The problem occurs with Zimbra acting as an LDAP client against an external LDAP server, performing a bind operation for authenticating users, with the following behaviour:
Zimbra against on openldap 2.3.x server, with TLS on port 389: OK Zimbra against on openldap 2.4.x server, on port 636: OK Zimbra against on openldap 2.4.x server, with TLS on port 389: 30s delay
Ok, so what you are saying is:
You upgraded your OS to CentOS6
You use external auth
The external auth from CentOS6 to your own LDAP server shows a 30 second delay on closing.
Almost :)
I upgraded the OS of the machine running the LDAP server, the zimbra server didn't change. So, that's the external auth from a RHEL 5.8 system running Zimbra 7, against a Centos 6.2 system running OpenLDAP 2.4.24.
That sounds like a bug in Java/JNDI. I did see some 30 second issues with RHEL6, but it was with initiating a connection, not closing it.
Indeed. I don't see why either the ldap server or the ldap client would need to perform any DNS query once the connection has been accepted, and the TLS negociation was successful.
You can see more about that at https://stomp.colorado.edu/blog/blog/2011/06/29/on-rhel-6-ssh-dns-firewalls-and-slow-logins/
I would note that JNDI behavior varies based on startTLS vs SSL on port 636 as well.
Interesting. We tried the workaround, on both side (openldap server, then zimbra server), without any behaviour change.
The only thing I'm unsure, tough, is about the moment where a change in resolver configuration has an effect. As the resolver is technicaly a part of the glibc, and can never get reloaded, I guess it uses the file timestamp to detect file modification and reload it immediatly.
Also, I'll try to find some other example of java-based applications acting as LDAP client, in our environement
On Mon, Jun 25, 2012 at 01:46:36PM +0200, Guillaume Rousse wrote:
for which any authentication attempt was suffering from an unexplained 30s additional delay.
Any time I hear of an odd precisely 30 second delay I ask if the configured DNS resolvers are all available for both forward and reverse lookups. That number is commonly associated with a DNS lookup timeout.
openldap-technical@openldap.org