olli.salli@vincit.fi wrote:
Full_Name: Olli Salli
Version: git master
OS: Windows 8.1, Linux 3.18.10
URL: ftp://ftp.openldap.org/incoming/olli-salli-150325.patch
Submission from: (NULL) (83.102.45.242)
Thanks for the report, fixed now in master.
We are working on an application which needs to perform some simple LDAP search queries every once in a while. The application is running as a daemon in an embedded server environment and has no user interface, and is instead remotely controlled and configured via a control TCP connection. This includes the configuration specifying the LDAP server address and port, and whether the LDAP queries should be attempted at all (if there is no server available). We are currently developing on Windows 8.1 with Visual Studio 2013 but will also run in Linux environments.
To ensure the control TCP connection stays alive at all times (and the daemon stays otherwise functional), and to avoid using threads, we have used the OpenLDAP asynchronous APIs, including the asynchronous connect option. Without it, the initial search query can block for a very long time if the configured LDAP server is unreachable. It is here that we have hit a small issue.
When using LDAP_OPT_CONNECT_ASYNC, if the LDAP server is unreachable, the initial request (e.g. using ldap_search_ext) does not block, which is correct. However, this first call returns LDAP_CONNECT_ERROR. If we disregard this, and continue reissuing the ldap_search_ext request periodically, following calls correctly return LDAP_X_CONNECTING. Then when the NETWORK_TIMEOUT has elapsed, LDAP_CONNECT_ERROR is returned again, which is correct.
The first ldap_search_ext call might also return LDAP_CONNECT_ERROR because of legitimate error conditions in connect(), even in asynchronous mode, for example out-of-resource conditions (EADDRNOTAVAIL) or all local network interfaces being down (ENETUNREACH). These should be handled as fatal errors, but they are impossible to distinguish from the false initial LDAP_CONNECT_ERROR that results from using LDAP_OPT_CONNECT_ASYNC.
The issue seems to be in ldap_send_initial_request (http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=blob;f=libraries/b...).
When async connect is enabled, what happens during the first request is:
1) sd is initialized to AC_SOCKET_INVALID
2) ber_sockbuf_ctrl( ... LBER_SB_OPT_GET_FD ... ) is called to determine whether there is already a connection, and to fetch its socket descriptor to sd
3) as there is no connection, it returns -1 and sd stays AC_SOCKET_INVALID
4) a new connection is formed using ldap_open_defconn()
5) ldap_int_check_async_open( ld, sd ) is called, but sd is still AC_SOCKET_INVALID, and thus the poll fails
6) LDAP_CONNECT_ERROR is returned
On successive calls, what happens is:
1) ber_sockbuf_ctrl( ... LBER_SB_OPT_GET_FD ... ) returns success and a valid socket descriptor
2) opening a new connection is skipped
3) ldap_int_check_async_open( ld, sd ) is called, this time with a valid socket descriptor
4) the poll works as intended
A simple fix is to reissue ber_sockbuf_ctrl( ... LBER_SB_OPT_GET_FD ... ) after opening the connection, so that sd holds the new socket descriptor before it is polled. With this change, the first poll returns LDAP_X_CONNECTING as intended. This is implemented in the patch at the URL. A perhaps more semantically correct alternative could be to return the created socket descriptor from ldap_open_defconn().
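
To make the sequence above easier to follow, here is a small standalone model of the control flow in C. It is only a sketch and not the libldap source: every model_* name, the MODEL_* constants, the refetch_fd flag and the descriptor value 42 are made up for illustration, and the server is assumed to stay unreachable, so a valid descriptor always polls as "still connecting".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libldap pieces named in the report. */
#define MODEL_SOCKET_INVALID (-1)   /* plays the role of AC_SOCKET_INVALID */

enum model_rc {
    MODEL_CONNECTING,      /* plays the role of LDAP_X_CONNECTING */
    MODEL_CONNECT_ERROR    /* plays the role of LDAP_CONNECT_ERROR */
};

struct model_conn {
    int fd;  /* MODEL_SOCKET_INVALID until a connect has been started */
};

/* Role of ber_sockbuf_ctrl( ... LBER_SB_OPT_GET_FD ... ): copy the current
 * descriptor to *sd, return -1 if there is no connection yet, else 1. */
static int model_get_fd(const struct model_conn *c, int *sd)
{
    *sd = c->fd;
    return c->fd == MODEL_SOCKET_INVALID ? -1 : 1;
}

/* Role of ldap_open_defconn(): start a non-blocking connect. */
static void model_open_defconn(struct model_conn *c)
{
    c->fd = 42;  /* arbitrary descriptor value, for the model only */
}

/* Role of ldap_int_check_async_open(): polling an invalid descriptor
 * fails; a valid one is reported as still connecting. */
static enum model_rc model_check_async_open(int sd)
{
    return sd == MODEL_SOCKET_INVALID ? MODEL_CONNECT_ERROR
                                      : MODEL_CONNECTING;
}

/* Role of ldap_send_initial_request(). With refetch_fd == 0 this follows
 * the reported flow; with refetch_fd != 0 it re-reads the descriptor
 * after opening the connection, as the proposed fix does. */
enum model_rc model_send_initial_request(struct model_conn *c, int refetch_fd)
{
    int sd = MODEL_SOCKET_INVALID;

    assert(c != NULL);
    if (model_get_fd(c, &sd) == -1) {
        model_open_defconn(c);
        if (refetch_fd)
            model_get_fd(c, &sd);  /* sd now refers to the new socket */
    }
    return model_check_async_open(sd);
}
```

Starting from a struct model_conn whose fd is MODEL_SOCKET_INVALID, the first call with refetch_fd set to 0 yields MODEL_CONNECT_ERROR and the second yields MODEL_CONNECTING, which is the behaviour we observed. With refetch_fd set to 1, both calls yield MODEL_CONNECTING.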