On Tuesday, 05 May 2009 22:48:10, Howard Chu wrote:
Ralf Haferkamp wrote:
> On Friday, 01 May 2009 11:50:15, masarati(a)aero.polimi.it wrote:
>>> For quite some time now, libldap has enabled TCP keepalive, e.g. to detect
>>> dead syncrepl connections. However, the default timeout of two hours that
>>> most systems use might be a bit too long for some applications (e.g. I
>>> recently had a problem where nscd didn't answer queries anymore because
>>> nss_ldap was blocking in SSL_read() while the underlying connection had
>>> been cut off). On the other hand, messing with the system-wide settings
>>> might not be a good idea either. On Linux it is possible to configure the
>>> keepalive settings on a per-socket basis through the TCP_KEEP* socket
>>> options.
>>> Would it be worth adding ldap_set_option() support for those, even if they
>>> are not really portable?
>> I think it would; for archs that do not support it, it could do nothing
>> (and log accordingly, just in case).
> Ok, I'll introduce the following new options for keepalive support then:
> LDAP_OPT_X_KEEPALIVE_IDLE 0x6300
> LDAP_OPT_X_KEEPALIVE_PROBES 0x6301
> LDAP_OPT_X_KEEPALIVE_INTERVAL 0x6302
> We might also think about adding support to set those values for syncrepl
> and back-ldap/back-meta.
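
For reference, here is a minimal sketch of what the per-socket tuning looks like on Linux with plain setsockopt(). The helper name set_keepalive() is hypothetical and not part of libldap, and the idea that the three proposed LDAP options would map onto TCP_KEEPIDLE, TCP_KEEPCNT and TCP_KEEPINTVL respectively is an assumption. On platforms without these options the sketch only enables SO_KEEPALIVE and silently ignores the tuning values.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not a libldap function): enable TCP keepalive on one
 * socket and, where the Linux-style options exist, tune it per socket.
 * Assumed mapping: idle -> TCP_KEEPIDLE, probes -> TCP_KEEPCNT,
 * interval -> TCP_KEEPINTVL. Returns 0 on success, -1 on failure. */
static int set_keepalive(int fd, int idle_secs, int probes, int interval_secs)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPCNT) && defined(TCP_KEEPINTVL)
    /* Seconds of idle time before the first probe is sent. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    /* Number of unanswered probes before the connection is dropped. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof(probes)) < 0)
        return -1;
    /* Seconds between two consecutive probes. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
#else
    /* No per-socket tuning available: the system-wide defaults apply. */
    (void)idle_secs;
    (void)probes;
    (void)interval_secs;
#endif
    return 0;
}
```

With, say, set_keepalive(fd, 60, 5, 10), a dead peer would be noticed after roughly 60 + 5 * 10 seconds instead of after the two-hour default.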
I'd prefer a portable solution vs something so extremely
platform-dependent. As already discussed many times before, we just need a
client to send a periodic LDAP no-op message to get the same effect.
(Abandon 0 will work fine.)
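
To make the "Abandon 0" idea concrete: message ID 0 is reserved for unsolicited notifications, so no outstanding operation can have it, and Abandon has no response. On the wire the request is only eight bytes. The following is a rough sketch of the BER encoding, assuming the sender's own message ID fits in a single byte (1..127); the helper name encode_abandon_zero() is hypothetical, and a real client would go through its LDAP library rather than build bytes by hand.

```c
#include <stddef.h>

/* Hypothetical helper: BER-encode
 *   LDAPMessage { messageID = msgid, protocolOp = AbandonRequest(0) }
 * into buf. Only handles msgid in 1..127 so that every length and integer
 * fits in one byte. Returns the number of bytes written, or 0 on error. */
static size_t encode_abandon_zero(unsigned char *buf, size_t buflen, int msgid)
{
    if (buf == NULL || buflen < 8 || msgid < 1 || msgid > 127)
        return 0;

    buf[0] = 0x30;                 /* SEQUENCE (LDAPMessage)            */
    buf[1] = 0x06;                 /* length of the six bytes below     */
    buf[2] = 0x02;                 /* INTEGER (messageID)               */
    buf[3] = 0x01;                 /* length 1                          */
    buf[4] = (unsigned char)msgid; /* our own message ID                */
    buf[5] = 0x50;                 /* [APPLICATION 16] AbandonRequest   */
    buf[6] = 0x01;                 /* length 1                          */
    buf[7] = 0x00;                 /* message ID to abandon: 0          */
    return 8;
}
```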
Something like what was proposed in ITS#5133? It seems that it was rejected
with a reference to the enabling of SO_KEEPALIVE, though. Should we revisit
that?
My problem was not so much with syncrepl though, I had nss_ldap making me
While it's not as general purpose as setting a
keepalive in the socket layer, I think we only need to worry about the
syncrepl client. back-ldap/meta already have their own retry mechanisms,
they can take care of themselves.
There seems to be a problem with many retry mechanisms when it comes to the
scenario I described in my original post. On a TLS-protected connection,
SSL_read() (called from ldap_result()) might trigger multiple read() calls. As
there are no select/poll calls in between them, one of those read()s might
block forever (until TCP keepalive kicks in) if the server is no longer
answering and didn't close the connection correctly (power failure,
I haven't had a good idea yet on how to fix this case easily, apart from
leveraging TCP keepalives.
(According to the docs, SSL_read() would fail with SSL_ERROR_WANT_READ, as
reported by SSL_get_error(), when the underlying BIO is non-blocking. But
we're using blocking IO. I am unsure how much effort it would be to port that
to non-blocking. I'd think it's a non-trivial task ;)).
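
Just to illustrate what a poll() in front of every read() buys: below is a small sketch on a plain file descriptor. It is not what happens inside SSL_read() with a blocking BIO (that is exactly the problem), and the helper name read_with_timeout() is made up for this example.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable and
 * only then call read(). Returns the read() result, or -1 with errno set to
 * ETIMEDOUT if nothing arrived in time. Note that an EINTR restart begins
 * the full timeout again; good enough for a sketch. */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;              /* poll() itself failed, errno is set */
    if (rc == 0) {
        errno = ETIMEDOUT;      /* peer stayed silent for timeout_ms  */
        return -1;
    }
    return read(fd, buf, len);  /* data, EOF or an error is pending   */
}
```

A caller that gets ETIMEDOUT can decide to drop the connection and retry instead of hanging until the kernel's keepalive timer fires.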
So - I'd rather see an option for a periodic LDAP ping added to the
syncrepl client - that will work uniformly across all platforms.
And in general - I am opposed to any code that causes our feature set /
behavior to differ from platform to platform.
Understandable, that's why I was asking before committing anything. But AFAIK
we have platform-specific issues in other places as well. (Or think about the
various LDAP_OPT_X_TLS settings that differ depending on which underlying SSL
implementation is used.)