Hi Kartik,
On Fri, Aug 03, 2018 at 11:19:06AM -0400, Kartik Subbarao wrote:
> I'm running into a problem with slapd 2.4.46 hanging on Ubuntu 18.04, which seems to be a side effect of the ITS#8650 patch:
> https://github.com/openldap/openldap/commit/7b5181da8cdd47a13041f9ee36fa9590...
> slapd will run fine for a while, but during some periods of high-traffic, it'll hang. It'll peg the CPU at 100% and won't respond to any new LDAP connections. After some time, it'll resume working again, but overall it's fairly unreliable.
Thanks for letting me know about this. This patch is running on quite a few systems by now, and I'm sorry the problem wasn't caught sooner. :/
> I'm wondering if there is a better way to handle EAGAIN returned from gnutls_handshake(), instead of doing a busywait as in ITS#8650, or my simplistic attempt at inserting a sleep() call which doesn't really seem to help. I'm wondering how the GnuTLS developers intend for people to use gnutls_handshake() properly, so as to gracefully handle sessions that involve long packets on the one hand, without opening up a vulnerability to chew up lots of system resources on the other hand.
Right. I mean, this is how GnuTLS's own example does it:
https://gitlab.com/gnutls/gnutls/blob/master/doc/examples/ex-client-dtls.c#L...
We could place a limit on the number of iterations, though any such limit would have to be arbitrary.
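To make the idea of a cap concrete, here is a rough sketch of what a bounded retry could look like. It is not taken from slapd or from GnuTLS: handshake_bounded(), hs_step_fn, and fake_peer are hypothetical names made up for illustration, and both limits are arbitrary. In real code the step callback would presumably wrap gnutls_handshake() and report GNUTLS_E_AGAIN or GNUTLS_E_INTERRUPTED as "retry", with gnutls_record_get_direction() deciding whether to wait for POLLIN or POLLOUT. Between attempts the sketch sleeps in poll() rather than spinning.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical status codes for this sketch. A real step function would
 * map GNUTLS_E_AGAIN / GNUTLS_E_INTERRUPTED to HS_RETRY and any fatal
 * GnuTLS error to HS_FATAL. */
enum hs_status { HS_DONE = 0, HS_RETRY = 1, HS_FATAL = -1 };

/* One handshake attempt (hypothetical callback type). */
typedef enum hs_status (*hs_step_fn)(void *ctx);

/* Retry the handshake at most max_attempts times, waiting up to wait_ms
 * for the socket between attempts instead of busy-waiting.
 * Returns 0 on success, -1 on a fatal error, -2 when the cap is hit. */
int handshake_bounded(int fd, short events, hs_step_fn step, void *ctx,
                      int max_attempts, int wait_ms)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        enum hs_status st = step(ctx);
        if (st == HS_DONE)
            return 0;
        if (st == HS_FATAL)
            return -1;

        /* HS_RETRY: block in poll() until the socket is ready or the
         * per-attempt timeout expires, so the CPU is not pegged. */
        struct pollfd pfd = { .fd = fd, .events = events };
        int rc;
        do {
            rc = poll(&pfd, 1, wait_ms);
        } while (rc < 0 && errno == EINTR);
        if (rc < 0)
            return -1;
        /* rc == 0 (timeout) still counts as an attempt; total wait is
         * therefore bounded by roughly max_attempts * wait_ms. */
    }
    return -2;
}

/* Stand-in for a peer that needs a few round trips before the handshake
 * completes. Purely for demonstration. */
struct fake_peer {
    int retries_left;
    int calls;
};

enum hs_status fake_step(void *ctx)
{
    struct fake_peer *p = ctx;
    p->calls++;
    if (p->retries_left > 0) {
        p->retries_left--;
        return HS_RETRY;
    }
    return HS_DONE;
}
```

The -2 return lets the caller tell "peer is too slow, drop the connection" apart from a genuine TLS failure. Note that poll() here still blocks the calling thread, so this only addresses the CPU spinning, not the question of tying up a slapd thread for the duration.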
There might be an asynchronous GnuTLS API that could be used to avoid tying up slapd while this is going on.
I will look at how some other GnuTLS servers deal with this...