https://bugs.openldap.org/show_bug.cgi?id=9474
Issue ID: 9474 Summary: ldap_install_tls() should return meaningful error code Product: OpenLDAP Version: 2.4.57 Hardware: All OS: All Status: UNCONFIRMED Severity: normal Priority: --- Component: libraries Assignee: bugs@openldap.org Reporter: simon.pichugin@gmail.com Target Milestone: ---
The description of my findings (take a note that these are OpenLDAP logs that happen under the application that uses libldap):
[sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: tls_write: want=610, written=610 ... [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS trace: SSL_connect:SSLv3 flush data [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: tls_read: want=5 error=Interrupted system call [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS trace: SSL_connect:error in SSLv3 read finished A [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS trace: SSL_connect:error in SSLv3 read finished A [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS: can't connect: . [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_free_connection 1 1 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_send_unbind [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ber_flush2: 7 bytes to sd 23 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: 0000: 00 05 00 01 00 42 00 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_write: want=7, written=7 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: 0000: 00 05 00 01 01 42 00 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_free_connection: actually freed
So, 'error=Interrupted system call' is caught by this: https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/t... https://git.openldap.org/openldap/openldap/-/blob/master/libraries/liblber/s... It is only the debug message that comes from the caller itself so we can see what is passed to OpenSSL. And 'Interrupted system call' is just an EINTR string representation.
What we should do is to catch the error that OpenSSL returns to us after it is interrupted.
As we can see from the logs: "libldap: TLS: can't connect: ." This line returns nothing: https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/t... So 'ld->ld_error' is set to empty value.
If we go deeper into the 'tls_imp->ti_session_errmsg' call we can reach the point where ERR_peek_error() is called: https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/t... https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/t... https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/t...
In the conclusion: ldap_install_tls() should return meaningful error code that would allow to figure out a reason for the failure, especially network IO fail due to EITR.
https://bugs.openldap.org/show_bug.cgi?id=9474
--- Comment #1 from Howard Chu hyc@openldap.org --- (In reply to Simon Pichugin from comment #0)
The description of my findings (take a note that these are OpenLDAP logs that happen under the application that uses libldap):
[sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: tls_write:
want=610, written=610 ... [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS trace: SSL_connect:SSLv3 flush data [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: tls_read: want=5 error=Interrupted system call [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS trace: SSL_connect:error in SSLv3 read finished A [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS trace: SSL_connect:error in SSLv3 read finished A [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: TLS: can't connect: . [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_free_connection 1 1 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_send_unbind [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ber_flush2: 7 bytes to sd 23 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: 0000: 00 05 00 01 00 42 00 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_write: want=7, written=7 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: 0000: 00 05 00 01 01 42 00 [sssd[be[LDAP]]] [sss_ldap_debug] (0x4000): libldap: ldap_free_connection: actually freed
So, 'error=Interrupted system call' is caught by this: https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/ tls2.c#L360 https://git.openldap.org/openldap/openldap/-/blob/master/libraries/liblber/ sockbuf.c#L829 It is only the debug message that comes from the caller itself so we can see what is passed to OpenSSL. And 'Interrupted system call' is just an EINTR string representation.
What we should do is to catch the error that OpenSSL returns to us after it is interrupted.
As we can see from the logs: "libldap: TLS: can't connect: ." This line returns nothing: https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/ tls2.c#L427 So 'ld->ld_error' is set to empty value.
If we go deeper into the 'tls_imp->ti_session_errmsg' call we can reach the point where ERR_peek_error() is called: https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/ tls2.c#L416 https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/ tls_o.c#L563 https://git.openldap.org/openldap/openldap/-/blob/master/libraries/libldap/ tls_o.c#L569
In the conclusion: ldap_install_tls() should return meaningful error code that would allow to figure out a reason for the failure, especially network IO fail due to EITR.
Sorry, your request is still vague. "Should return meaningful error code" - which error code are you asking to be returned? The EINTR is returned from the I/O layer to the TLS library, so it's up to the TLS library to give that back to libldap. If ERR_peek_error() gives us an empty string then there's nothing else for us to return.
https://bugs.openldap.org/show_bug.cgi?id=9474
--- Comment #2 from Simon Pichugin simon.pichugin@gmail.com --- (In reply to Howard Chu from comment #1)
Sorry, your request is still vague. "Should return meaningful error code" - which error code are you asking to be returned? The EINTR is returned from the I/O layer to the TLS library, so it's up to the TLS library to give that back to libldap. If ERR_peek_error() gives us an empty string then there's nothing else for us to return.
ERR_peek_error() returns the earliest error code from the thread's error queue.
My assumption is that the queue may have a few errors at the moment when ERR_peek_error() function is called.
I understand why ERR_peek_error() is chosen here but we can make libldap more robust and check other errors if ERR_peek_error() has an empty value. If that's the case...
What do you think?
https://bugs.openldap.org/show_bug.cgi?id=9474
--- Comment #3 from Howard Chu hyc@openldap.org --- (In reply to Simon Pichugin from comment #2)
(In reply to Howard Chu from comment #1)
Sorry, your request is still vague. "Should return meaningful error code" - which error code are you asking to be returned? The EINTR is returned from the I/O layer to the TLS library, so it's up to the TLS library to give that back to libldap. If ERR_peek_error() gives us an empty string then there's nothing else for us to return.
ERR_peek_error() returns the earliest error code from the thread's error queue.
My assumption is that the queue may have a few errors at the moment when ERR_peek_error() function is called.
I understand why ERR_peek_error() is chosen here but we can make libldap more robust and check other errors if ERR_peek_error() has an empty value. If that's the case...
What do you think?
Feel free to post a patch demonstrating your idea.
https://bugs.openldap.org/show_bug.cgi?id=9474
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- See Also| |https://bugs.openldap.org/s | |how_bug.cgi?id=9157
https://bugs.openldap.org/show_bug.cgi?id=9474
--- Comment #4 from Simon Pichugin simon.pichugin@gmail.com --- Okay, after a deeper investigation we found that both ERR_peek_error() and ERR_peek_last_error() are empty when OpenSSL is interruted while doing read() or poll() so it's impossible to get the verbose info from ld->ld_error (as it's empty).
But we found a place where we can retrieve the error - 'errno'. After ldap_install_tls() fails, 'errno == EINTR' which describes exactly the cause.
So I'd like to change the purpose of this bug (or I can open a new one if you say me do so).
Could we please update doc/man/man3/ldap_tls.3 and describe there that the libldap caller can rely on 'errno' value if ldap_install_tls() just has failed?
I know that 'errno' can be changed in certain cases and we want to be sure that libldap won't change 'errno' after ldap_install_tls() failure. So we can use it safely and write something like this:
lret = ldap_install_tls(state->ldap); if (lret != LDAP_SUCCESS) { if (errno == EINTR) { /* we can retry later in certain cases */ DEBUG(SSSDBG_CRIT_FAILURE, "ldap_install_tls failed: connect was interrupted\n"); sss_log(SSS_LOG_ERR, "Could not start TLS encryption. [%d] [%s]", errno, strerror(errno)); ret = errno; goto fail; } else { /* more error processing with ld->ld_error */ ... } }
Should I suggest a patch for the man page? If you are okay with the approach...
https://bugs.openldap.org/show_bug.cgi?id=9474
--- Comment #5 from Quanah Gibson-Mount quanah@openldap.org --- (In reply to Simon Pichugin from comment #4)
Should I suggest a patch for the man page? If you are okay with the approach...
OpenLDAP supports multiple TLS implementations where this behavior can differ depending on implementation, so a man page patch is not appropriate. Thank you though!
https://bugs.openldap.org/show_bug.cgi?id=9474
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |WONTFIX Status|UNCONFIRMED |RESOLVED
https://bugs.openldap.org/show_bug.cgi?id=9474
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |VERIFIED