Rallavagu Kon wrote:
All,
We have deployed OpenLDAP 2.4 in production (with replication), mainly serving saslauthd and sendmail on Linux with client library 2.4.49. We are experiencing sporadic errors: "ldap_simple_bind() failed -1 (Can't contact LDAP server)". This happens both in saslauthd and in OpenLDAP itself when replicating to another server. On investigation, however, we found that the error is not disruptive: both services (saslauthd and the OpenLDAP server) have built-in "retry" options, and the request immediately after a failure receives a successful response.

We also noticed that the issue appears only when communication goes through an ELB (AWS Elastic Load Balancer); no connection issues appeared when clients were pointed directly at the OpenLDAP server (all connections are TLS). This makes me suspect that the OpenLDAP client library might be caching the connection (or keeping it alive via TCP keepalive, etc.) in a way that does not work well with the ELB.

Is the connection really cached, and is there a configuration parameter I can tune to change this behavior? I tried to chase this down in the client library source code but could not locate it. Perhaps I need to look further, but I am wondering whether anyone else in the community has experience with this.
Connection caching in libldap was a feature about 20 years ago, but it was removed because it was difficult to configure and use properly. So no, there's no such feature in any recent libldap. When you do an ldap_initialize() / ldap_bind() sequence, you're getting a new TCP connection. Sounds like your load balancer is buggy.
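To make the "new connection per attempt, retry on failure" pattern concrete, here is a minimal Python sketch. It is not libldap code and not how saslauthd or slapd implement their retries; it only illustrates why a retry on a fresh TCP connection can mask a sporadic failure in front of the server. The function name `with_fresh_connection` and all of its parameters are hypothetical, chosen for this illustration.

```python
import socket
import time


def with_fresh_connection(host, port, request, attempts=2, timeout=5.0, delay=0.1):
    """Run request(sock) on a brand-new TCP connection, retrying on failure.

    Hypothetical helper: no connection is cached or reused between attempts,
    so a retry never inherits a socket that an intermediary has dropped.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # A new TCP connection is opened for every attempt and closed
            # again when the with-block exits.
            with socket.create_connection((host, port), timeout=timeout) as sock:
                return request(sock)
        except OSError as exc:
            # Covers connection refused, reset, timeout and similar errors.
            last_error = exc
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise ConnectionError(f"all {attempts} attempts failed") from last_error
```

Here `request` is any callable that takes the connected socket and performs one exchange. If the first attempt fails with a socket error, the second attempt runs on a new connection, which matches the observation that the request immediately after a failure succeeds.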
Thank You.