Full_Name: Vernon Smith
Version: 2.4.47
OS: Linux, Solaris
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (2601:40d:4300:679a:d18c:3060:e826:d35c)
I am using libldap library built with -DLDAP_USE_NON_BLOCKING_TLS and configured
for Async connection mode. I test making connections using the library to
servers that are hung to verify that my application will not hang in those
cases. I have found 3 issues. The first is that the ldap_pvt_connect() clears
non-blocking socket setup after the connection is made even when Async mode was
configured. So here is my patch for that.
diff --git a/libraries/libldap/os-ip.c b/libraries/libldap/os-ip.c
index a823cc6..d7927e5 100644
--- a/libraries/libldap/os-ip.c
+++ b/libraries/libldap/os-ip.c
@@ -443,7 +443,7 @@ ldap_pvt_connect(LDAP *ld, ber_socket_t s,
if ( connect(s, sin, addrlen) != AC_SOCKET_ERROR ) {
osip_debug(ld, "connect success\n", 0, 0, 0);
- if ( opt_tv && ldap_pvt_ndelay_off(ld, s) == -1 )
+ if ( !async && opt_tv && ldap_pvt_ndelay_off(ld, s) == -1 )
return ( -1 );
return ( 0 );
}
The second issue is that the tlso_session_connect() routine does not correctly
handle the return code from SSL_connect(), it just returns it to the caller. For
LDAP_USE_NON_BLOCKING_TLS, the return code must be checked and an appropriate
return used. Here is my patch.
diff --git a/libraries/libldap/tls_o.c b/libraries/libldap/tls_o.c
index e95a448..7a31b5e 100644
--- a/libraries/libldap/tls_o.c
+++ b/libraries/libldap/tls_o.c
@@ -531,7 +531,23 @@ tlso_session_connect( LDAP *ld, tls_session *sess )
tlso_session *s = (tlso_session *)sess;
/* Caller expects 0 = success, OpenSSL returns 1 = success */
- return SSL_connect( s ) - 1;
+ int rc = SSL_connect( s ) - 1;
+
+#ifdef LDAP_USE_NON_BLOCKING_TLS
+ int sslerr = SSL_get_error(s, rc+1);
+ int sockerr = sock_errno();
+
+ if ( rc < 0 ) {
+ if ( sslerr == SSL_ERROR_WANT_READ || sslerr == SSL_ERROR_WANT_WRITE ) {
+ rc = 0;
+ } else if (( sslerr == SSL_ERROR_SYSCALL ) &&
+ ( sockerr == EAGAIN || sockerr == ENOTCONN )) {
+ rc = 0;
+ }
+ }
+#endif /* LDAP_USE_NON_BLOCKING_TLS */
+
+ return rc;
}
static int
The third issue is that ldap_int_tls_start() compiled with
-DLDAP_USE_NON_BLOCKING_TLS Plays with the socket non-blocking setting even if
Async mode is configured. Here is mt patch to only play with the socket setting
if not in Async mode.
diff --git a/libraries/libldap/tls2.c b/libraries/libldap/tls2.c
index d9b2d27..69b749c 100644
--- a/libraries/libldap/tls2.c
+++ b/libraries/libldap/tls2.c
@@ -1075,8 +1075,11 @@ ldap_int_tls_start ( LDAP *ld, LDAPConn *conn,
LDAPURLDesc *srv )
/*
* Use non-blocking io during SSL Handshake when a timeout is configured
*/
+ int async = LDAP_BOOL_GET( &ld->ld_options, LDAP_BOOL_CONNECT_ASYNC );
if ( ld->ld_options.ldo_tm_net.tv_sec >= 0 ) {
- ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, (void*)1 );
+ if ( ! async ) {
+ ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, (void*)1 );
+ }
ber_sockbuf_ctrl( sb, LBER_SB_OPT_GET_FD, &sd );
tv = ld->ld_options.ldo_tm_net;
tv0 = tv;
@@ -1110,8 +1113,10 @@ ldap_int_tls_start ( LDAP *ld, LDAPConn *conn,
LDAPURLDesc *srv )
ld->ld_errno = LDAP_TIMEOUT;
break;
} else {
- /* ldap_int_poll called ldap_pvt_ndelay_off */
- ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, (void*)1 );
+ /* ldap_int_poll called ldap_pvt_ndelay_off if not in async mode */
+ if ( ! async ) {
+ ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, (void*)1 );
+ }
ret = ldap_int_tls_connect( ld, conn, host );
if ( ret > 0 ) { /* need to call tls_connect once more */
struct timeval curr_time_tv, delta_tv;
@@ -1159,7 +1164,9 @@ ldap_int_tls_start ( LDAP *ld, LDAPConn *conn, LDAPURLDesc
*srv )
}
}
if ( ld->ld_options.ldo_tm_net.tv_sec >= 0 ) {
- ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, NULL );
+ if ( ! async ) {
+ ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, NULL );
+ }
}
#endif /* LDAP_USE_NON_BLOCKING_TLS */
Thanks, Vern
=C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
> On Mon, Feb 18, 2019 at 5:18 PM Howard Chu <hyc(a)symas.com> wrote:
>>
>> avarab(a)gmail.com wrote:
>>> Full_Name: .var Arnfj.r. Bjarmason
>>> Version: 2.4.44-21
>>> OS: CentOS 7.6
>>> URL: ftp://ftp.openldap.org/incoming/
>>> Submission from: (NULL) (5.57.21.154)
>>>
>>>
>>> On a setup where you have a blackholed DNS server:
>>>
>>> $ grep ^name /etc/resolv.conf
>>> nameserver 1.2.3.4
>>>
>>> Running e.g.:
>>>
>>> $ time ldapsearch -l 2 -o nettimeout=3D1 [...]
>>>
>>> Will (on my system) eventually return:
>>>
>>> ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
>>>
>>> real 0m24.039s
>>> user 0m0.004s
>>> sys 0m0.007s
>>>
>>> It'll take around 48 seconds if I have two DNS servers. Running strac=
e(1) on it
>>> reveals that it's sitting in a socket/connect/poll loop trying to loo=
kup the
>>> hostname of the LDAP server I'm trying to talk to.
>>>
>>> Instead one of these options should limit time spent on DNS lookups, =
or there
>>> should be another option, so that you can run ldapsearch with a combi=
nation of
>>> these options and be sure that it'll run in at most the <timeout> you=
give it.
>>
>> The current code in libldap uses gethostbyname() and this API doesn't =
offer
>> any parameters for setting a timeout. You can configure a timeout in y=
our
>> system's /etc/resolv.conf or using RES_OPTIONS environment variable. R=
ead your
>> system's resolver(5) manpage.
>>
>> If you know of any widely available resolver API that allows specifyin=
g a timeout
>> on individual queries, you're welcome to submit a patch supporting thi=
s feature.
>> The standard -lresolv doesn't provide such an interface.
>=20
> That resolv.conf had "options timeout:2". That it didn't kick in is
> probably a bug somewhere else, but it would still be handy if
> ldapsearch had this itself, since the timeout you use for general
> resolutions might not be what you want for ldapsearch.
If all you need is a timeout in the ldapsearch command, wrap it in a scri=
pt
that sets the RES_OPTIONS environment variable.
> But implementing that is a mess. This SO post has a summary:
> https://stackoverflow.com/questions/24403435/socket-hostname-lookup-tim=
eout-how-to-implement-it
>=20
> So e.g. there could be a probe for getaddrinfo_a() on GNU systems.
>=20
>> Closing this ITS.
>=20
> Fair enough, but is that a "we don't want this ever" or "if someone
> submitted a patch for (possibly OS-specific) DNS timeout we'd be
> interested".
Generally not fond of features that aren't available cross-platform, part=
icularly
if that means it can only be tested on a particular OS. I guess for GNU e=
xtensions
that are easily detected and widely supported it may be OK.
We already use getaddrinfo() if it's available, so adding getaddrinfo_a()=
may
not be too messy.
--=20
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
--On Monday, February 18, 2019 4:32 PM +0000 avarab(a)gmail.com wrote:
>> Closing this ITS.
>
> Fair enough, but is that a "we don't want this ever" or "if someone
> submitted a patch for (possibly OS-specific) DNS timeout we'd be
> interested".
This was already answered in Howard's reply:
> If you know of any widely available resolver API that allows specifying a
timeout
> on individual queries, you're welcome to submit a patch supporting this
feature.
> The standard -lresolv doesn't provide such an interface.
I.e., the ITS can always be reopened if a working resolver API can be found
and a patch is supplied.
--Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
On Mon, Feb 18, 2019 at 5:18 PM Howard Chu <hyc(a)symas.com> wrote:
>
> avarab(a)gmail.com wrote:
> > Full_Name: .var Arnfj.r. Bjarmason
> > Version: 2.4.44-21
> > OS: CentOS 7.6
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (5.57.21.154)
> >
> >
> > On a setup where you have a blackholed DNS server:
> >
> > $ grep ^name /etc/resolv.conf
> > nameserver 1.2.3.4
> >
> > Running e.g.:
> >
> > $ time ldapsearch -l 2 -o nettimeout=1 [...]
> >
> > Will (on my system) eventually return:
> >
> > ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
> >
> > real 0m24.039s
> > user 0m0.004s
> > sys 0m0.007s
> >
> > It'll take around 48 seconds if I have two DNS servers. Running strace(1) on it
> > reveals that it's sitting in a socket/connect/poll loop trying to lookup the
> > hostname of the LDAP server I'm trying to talk to.
> >
> > Instead one of these options should limit time spent on DNS lookups, or there
> > should be another option, so that you can run ldapsearch with a combination of
> > these options and be sure that it'll run in at most the <timeout> you give it.
>
> The current code in libldap uses gethostbyname() and this API doesn't offer
> any parameters for setting a timeout. You can configure a timeout in your
> system's /etc/resolv.conf or using RES_OPTIONS environment variable. Read your
> system's resolver(5) manpage.
>
> If you know of any widely available resolver API that allows specifying a timeout
> on individual queries, you're welcome to submit a patch supporting this feature.
> The standard -lresolv doesn't provide such an interface.
That resolv.conf had "options timeout:2". That it didn't kick in is
probably a bug somewhere else, but it would still be handy if
ldapsearch had this itself, since the timeout you use for general
resolutions might not be what you want for ldapsearch.
But implementing that is a mess. This SO post has a summary:
https://stackoverflow.com/questions/24403435/socket-hostname-lookup-timeout…
So e.g. there could be a probe for getaddrinfo_a() on GNU systems.
> Closing this ITS.
Fair enough, but is that a "we don't want this ever" or "if someone
submitted a patch for (possibly OS-specific) DNS timeout we'd be
interested".
> > As a workaround I'm using ldapsearch with /usr/bin/timeout, but since it kills
> > it if it exceeds the timeout I don't get a meaningful error.
>
>
> --
> -- Howard Chu
> CTO, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/
avarab(a)gmail.com wrote:
> Full_Name: .var Arnfj.r. Bjarmason
> Version: 2.4.44-21
> OS: CentOS 7.6
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (5.57.21.154)
>
>
> On a setup where you have a blackholed DNS server:
>
> $ grep ^name /etc/resolv.conf
> nameserver 1.2.3.4
>
> Running e.g.:
>
> $ time ldapsearch -l 2 -o nettimeout=1 [...]
>
> Will (on my system) eventually return:
>
> ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
>
> real 0m24.039s
> user 0m0.004s
> sys 0m0.007s
>
> It'll take around 48 seconds if I have two DNS servers. Running strace(1) on it
> reveals that it's sitting in a socket/connect/poll loop trying to lookup the
> hostname of the LDAP server I'm trying to talk to.
>
> Instead one of these options should limit time spent on DNS lookups, or there
> should be another option, so that you can run ldapsearch with a combination of
> these options and be sure that it'll run in at most the <timeout> you give it.
The current code in libldap uses gethostbyname() and this API doesn't offer
any parameters for setting a timeout. You can configure a timeout in your
system's /etc/resolv.conf or using RES_OPTIONS environment variable. Read your
system's resolver(5) manpage.
If you know of any widely available resolver API that allows specifying a timeout
on individual queries, you're welcome to submit a patch supporting this feature.
The standard -lresolv doesn't provide such an interface.
Closing this ITS.
> As a workaround I'm using ldapsearch with /usr/bin/timeout, but since it kills
> it if it exceeds the timeout I don't get a meaningful error.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Full_Name: .var Arnfj.r. Bjarmason
Version: 2.4.44-21
OS: CentOS 7.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (5.57.21.154)
On a setup where you have a blackholed DNS server:
$ grep ^name /etc/resolv.conf
nameserver 1.2.3.4
Running e.g.:
$ time ldapsearch -l 2 -o nettimeout=1 [...]
Will (on my system) eventually return:
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
real 0m24.039s
user 0m0.004s
sys 0m0.007s
It'll take around 48 seconds if I have two DNS servers. Running strace(1) on it
reveals that it's sitting in a socket/connect/poll loop trying to lookup the
hostname of the LDAP server I'm trying to talk to.
Instead one of these options should limit time spent on DNS lookups, or there
should be another option, so that you can run ldapsearch with a combination of
these options and be sure that it'll run in at most the <timeout> you give it.
As a workaround I'm using ldapsearch with /usr/bin/timeout, but since it kills
it if it exceeds the timeout I don't get a meaningful error.
khng300(a)gmail.com wrote:
> Full_Name: Ka Ho Ng
> Version: mdb.master
> OS: Windows 10
> URL: ftp://ftp.openldap.org/incoming/Ka-Ho-Ng-190218.patch
> Submission from: (NULL) (2001:470:fa95:1300::1)
Thanks for the patch, committed to mdb.master
>
> Fix mdb_env_open2() failing when getting handle for NTDLL.dll
>
> Always call GetModuleHandleW() with Unicode string, as mdb_fopen() is
> calling CreateFileW() already. Otherwise GetModuleHandle() will fail with
> non-Unicode string.
>
> I, Ka Ho Ng, hereby place the following modifications to OpenLDAP Software (and
> only these modifications) into the public domain. Hence, these modifications may
> be freely used and/or redistributed for any purpose with or without attribution
> and/or other notice.
>
>
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Full_Name: Ka Ho Ng
Version: mdb.master
OS: Windows 10
URL: ftp://ftp.openldap.org/incoming/Ka-Ho-Ng-190218.patch
Submission from: (NULL) (2001:470:fa95:1300::1)
Fix mdb_env_open2() failing when getting handle for NTDLL.dll
Always call GetModuleHandleW() with Unicode string, as mdb_fopen() is
calling CreateFileW() already. Otherwise GetModuleHandle() will fail with
non-Unicode string.
I, Ka Ho Ng, hereby place the following modifications to OpenLDAP Software (and
only these modifications) into the public domain. Hence, these modifications may
be freely used and/or redistributed for any purpose with or without attribution
and/or other notice.
On Fri, Feb 15, 2019 at 5:16 PM Howard Chu <hyc(a)symas.com> wrote:
>
> matthieu.guegan(a)smile-suisse.com wrote:
> > On Fri, Feb 15, 2019 at 4:33 PM Howard Chu <hyc(a)symas.com> wrote:
> >>
> >> Matthieu GUEGAN wrote:
> >>> On Fri, Feb 15, 2019 at 3:56 PM Howard Chu <hyc(a)symas.com> wrote:
> >>>>
> >>>> mguegan(a)virtua.ch wrote:
> >>>>> Full_Name: Matthieu Guegan
> >>>>> Version: current
> >>>>> OS: SmartOS
> >>>>> URL:
> >>>>> Submission from: (NULL) (109.190.148.77)
> >>>>>
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> In order to compile knot[1] on SmartOS, I have done a series of little patches.
> >>>>> One of them is linked to the LMDB project, which redirects me here.
> >>>>>
> >>>>> The idea is to tell SmartOS (and I think the SunOS family) to make use of
> >>>>> posix_madvise(3), instead of madvise(3) as implementation on this OS[2] is
> >>>>> different than, for example, the Linux[3] one.
> >>>>>
> >>>>> So, I kindly ask you to take a look at the merge request of the knot project[4],
> >>>>> in particular the `src/contrib/lmdb/mdb.c` patch (that's just a ifdef
> >>>>> modification) : would it be possible to apply this patch directly in this
> >>>>> repository ?
> >>>>
> >>>> I see no functional difference between your references [2] and [3].
> >>>> What is the benefit of this patch?
> >>>
> >>> Well, the Solaris/SunOS's madvise(3) is defined with an old `caddr_t`
> >>> type whereas Linux (and *BSD OSes too) use a more modern approach with
> >>> `void *`.
> >>> On the `knot` project, which embed the lmdb part of openldap,
> >>> compilation failed when using madvise(3) on SmartOS, so I did a
> >>> replacement with posix_madvise(3) which uses the `void *` argument
> >>> too.
> >>> I don't know what's the best way to handle this case, that's why I ask here.
> >>
> >> A caddr_t is just a typedef of "char *". It should always be legal to
> >> pass a void * to such a parameter. What is your compile error message?
> >>
> >
> > ```
> > contrib/lmdb/mdb.c: In function 'mdb_env_map':
> > contrib/lmdb/mdb.c:3998:3: error: implicit declaration of function
> > 'madvise'; did you mean 'raise'?
> > [-Werror=implicit-function-declaration]
> > madvise(env->me_map, env->me_mapsize, MADV_RANDOM);
> > ^~~~~~~
> > raise
>
> This has absolutely nothing to do with whether the parameter is a caddr_t or a void *.
> Why would you even mention something totally irrelevant to the actual error?
Sorry if I mislead you with that, that was not my intention.
I was just trying to answer you when asking the difference using
posix_madvise() vs madvise() on SmartOS.
And that was the only thing I found. Indeed, it seems completely
irrelevant to the error I get.
> Strange that you have MADV_RANDOM defined but no declaration of madvise().
Indeed, searching in the header file `sys/mman.h` [5], it seems to be
possible to have a condition that enable madvise() and not MADV_RANDOM
or vice-versa.
> Sounds like your system's header files are playing stupid games with definitions.
> If you can find out what flags must be used to make the header files properly
> declare the madvise function, a patch for that may be acceptable.
I should say that I'm using the pkgsrc framework in order to compile
the whole thing and found multiple packages on which was applied the
same type of patch (excluding __sun when trying to use madvise(), thus
using the posix_madvise variant).
Flags can be overridden here, so it must be something related with the
pkgsrc's Makefile variables.
Anyway, maybe this should be done this way, instead of trying to
change things upstream. Thanks for your help.
[5] https://github.com/illumos/illumos-gate/blob/4e0c5eff9af325c80994e9527b7cb8…
matthieu.guegan(a)smile-suisse.com wrote:
> On Fri, Feb 15, 2019 at 4:33 PM Howard Chu <hyc(a)symas.com> wrote:
>>
>> Matthieu GUEGAN wrote:
>>> On Fri, Feb 15, 2019 at 3:56 PM Howard Chu <hyc(a)symas.com> wrote:
>>>>
>>>> mguegan(a)virtua.ch wrote:
>>>>> Full_Name: Matthieu Guegan
>>>>> Version: current
>>>>> OS: SmartOS
>>>>> URL:
>>>>> Submission from: (NULL) (109.190.148.77)
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> In order to compile knot[1] on SmartOS, I have done a series of little patches.
>>>>> One of them is linked to the LMDB project, which redirects me here.
>>>>>
>>>>> The idea is to tell SmartOS (and I think the SunOS family) to make use of
>>>>> posix_madvise(3), instead of madvise(3) as implementation on this OS[2] is
>>>>> different than, for example, the Linux[3] one.
>>>>>
>>>>> So, I kindly ask you to take a look at the merge request of the knot project[4],
>>>>> in particular the `src/contrib/lmdb/mdb.c` patch (that's just a ifdef
>>>>> modification) : would it be possible to apply this patch directly in this
>>>>> repository ?
>>>>
>>>> I see no functional difference between your references [2] and [3].
>>>> What is the benefit of this patch?
>>>
>>> Well, the Solaris/SunOS's madvise(3) is defined with an old `caddr_t`
>>> type whereas Linux (and *BSD OSes too) use a more modern approach with
>>> `void *`.
>>> On the `knot` project, which embed the lmdb part of openldap,
>>> compilation failed when using madvise(3) on SmartOS, so I did a
>>> replacement with posix_madvise(3) which uses the `void *` argument
>>> too.
>>> I don't know what's the best way to handle this case, that's why I ask here.
>>
>> A caddr_t is just a typedef of "char *". It should always be legal to
>> pass a void * to such a parameter. What is your compile error message?
>>
>
> ```
> contrib/lmdb/mdb.c: In function 'mdb_env_map':
> contrib/lmdb/mdb.c:3998:3: error: implicit declaration of function
> 'madvise'; did you mean 'raise'?
> [-Werror=implicit-function-declaration]
> madvise(env->me_map, env->me_mapsize, MADV_RANDOM);
> ^~~~~~~
> raise
This has absolutely nothing to do with whether the parameter is a caddr_t or a void *.
Why would you even mention something totally irrelevant to the actual error?
Strange that you have MADV_RANDOM defined but no declaration of madvise().
Sounds like your system's header files are playing stupid games with definitions.
If you can find out what flags must be used to make the header files properly
declare the madvise function, a patch for that may be acceptable.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/