Hi!
On Tue, Jan 22, 2019 at 04:46:14PM +0200, Janne Peltonen wrote:
Regardless, you should at the least update to the latest RHEL7 version from RH to see if it offers any relief from the issue you are encountering. There are also alternatives to the RH build that you can use on RH, such as:
Yeah, we're looking into that already. Should probably have done that first, before asking on the list. Oh well.
We had a look at this, but still ended up with some "Start TLS failed" errors on the proxy. We were able to reproduce them on a server with no other load by opening 200 simultaneous ldaps connections, all binding as the same user.
Next, we tried Unto Sten's suggestion: we confirmed that the "timeout" variable is zero, so we go into the "else" branch he mentioned; then, instead of calling the macro in the else branch, we just directly set tv.tv_sec = 3 and tv.tv_usec = 0 (a quick and dirty hack, I know). After that, we were no longer able to get any "Start TLS failed" errors on the proxy, and all proxy binds completed successfully. To make sure, we downgraded the proxy again, and sure enough, the "Start TLS failed" errors reappeared, or rather, we began to see some of them again. Upgraded again, and no errors at all.
To us, it really looks as if the root of the problem is that the StartTLS timeout ends up being 0.1 seconds, which is too short if there is any latency in the network. What would be the correct place to fix this? It appears to me that you should be able to say "timeout extended=5" or something similar in a config file, but in back-ldap/config.c the "extended" timeout option is commented out as unimplemented. So, what would be required to implement it?
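To make the question a bit more concrete, this is roughly what I imagine the parsing side of a "timeout extended=5" option could look like. It is only a sketch under my own assumptions: the enum, the name table and the function are hypothetical and do not reflect how timeout_table in back-ldap/config.c is actually laid out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical operation indexes; not the real back-ldap ones. */
enum { SKETCH_OP_BIND, SKETCH_OP_EXTENDED, SKETCH_OP_LAST };

/* Hypothetical keyword table mapping option names to indexes. */
static const char *const sketch_names[SKETCH_OP_LAST] = { "bind", "extended" };

/*
 * Parse one "name=seconds" argument (e.g. "extended=5") and store the
 * value in timeouts[]. Returns 0 on success, -1 on a malformed argument,
 * an unknown name, or a negative value.
 */
static int sketch_parse_timeout(const char *arg, long timeouts[SKETCH_OP_LAST])
{
    const char *eq;
    char *end;
    size_t len;
    long val;
    int i;

    assert(arg != NULL && timeouts != NULL);

    eq = strchr(arg, '=');
    if (eq == NULL || eq == arg) {
        return -1;              /* no '=' or empty name */
    }
    len = (size_t)(eq - arg);

    val = strtol(eq + 1, &end, 10);
    if (end == eq + 1 || *end != '\0' || val < 0) {
        return -1;              /* empty, trailing junk, or negative */
    }

    for (i = 0; i < SKETCH_OP_LAST; i++) {
        if (strlen(sketch_names[i]) == len
                && strncmp(arg, sketch_names[i], len) == 0) {
            timeouts[i] = val;
            return 0;
        }
    }
    return -1;                  /* unknown operation name */
}
```

The StartTLS code would then look up the "extended" entry and only fall back to the 0.1 s default when that entry is zero.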
Relevant files:
  back-ldap/bind.c (ldap_back_start_tls function, setting of tv using the LDAP_BACK_TV_SET macro)
  back-ldap/back-ldap.h (defines LDAP_BACK_TV_SET to basically set the timeout to 0.1 s)
  back-ldap/config.c (definition of timeout_table)
Best,
Janne / Helsinki Uni