Andrew Bartlett wrote:
On Tue, 2009-05-26 at 04:26 -0700, Howard Chu wrote:
abartlet@samba.org wrote:
Full_Name: Andrew Bartlett
Version: CVS HEAD
OS: Fedora 10
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (59.167.251.137)
Samba4's provision and 'make test' seem to create some internal situation in OpenLDAP slapd in which it will not accept any more connections over ldapi:///
This is best seen by building Samba4 and running:
TEST_LDAP=yes OPENLDAP_ROOT=/usr/local make test
slapd does not crash, but simply stops accepting new connections. Samba4 then crashes due to some other bug (the case of the LDAP backend not responding is clearly untested code in Samba4).
It isn't a Samba4 client bug, as ldapsearch also fails to respond.
This seems very, very similar to ITS#5261.
Further testing with Andrew's kvm image shows the hang only occurs when Cyrus SASL's libsasldb2.so plugin is present. I always remove that plugin from my installs, since I only use in-directory SASL secrets. That's probably why I wasn't seeing the reported behavior before.
Also a note - it's still not clear we've been talking about the same thing up to this point. Even when the samba test suite hangs, I see that ldapsearch still works fine against slapd. At any rate, currently all of the samba4 tests pass for me.
Hmm. Using that KVM image, with the libsasldb moved aside (and with it left in place), I still get errors.
However, there is an important difference: previously, once it locked up, nothing proceeded; now it proceeds, as if the failure were only temporary. The error has changed too: instead of a failure to connect, it is an inability to successfully read the rootDSE.
I agree, we might be jumping at different shadows here, but your patches did fix something...
I see what you're describing now, with the KVM image configured with 2 CPUs. It appears to be a bug caused by the recent patch for connection_hangup() processing. Running slapd with -d15 in your test shows that a connection is closed shortly after being established and becoming readable. The bug is (probably) that we queued the reader but processed the hangup immediately, thus closing the connection before the reader executes. I'm not exactly sure why this causes the problem in your test, since it looks like your client is closing the socket before waiting for the reply. But this is certainly the right area.
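To make the suspected ordering problem concrete, here is a toy model in Python. It is only a sketch of the race described above, not slapd's code: the Conn class, the run() function, the event names, and the defer_hangup switch are all made up for illustration. A "readable" event queues a read for later (standing in for handing work to a thread pool), while a "hangup" event is either applied at once or queued behind the read.

```python
from collections import deque


class Conn:
    """Hypothetical connection: holds the request the client already sent."""

    def __init__(self, pending):
        self.pending = pending
        self.is_open = True


def run(events, defer_hangup):
    """Dispatch (kind, conn) events, then drain the work queue.

    Returns the list of requests that a reader actually got to process.
    """
    work = deque()  # stands in for a worker thread pool queue
    handled = []

    # Dispatch phase: the listener thread reacts to socket events.
    for kind, conn in events:
        if kind == "readable":
            work.append(("read", conn))       # reader runs later
        elif kind == "hangup":
            if defer_hangup:
                work.append(("close", conn))  # close runs after the reader
            else:
                conn.is_open = False          # close runs right now

    # Worker phase: queued tasks finally execute, in order.
    while work:
        op, conn = work.popleft()
        if op == "read":
            if conn.is_open:
                handled.append(conn.pending)
            # else: the connection is already gone, the request is lost
        else:
            conn.is_open = False
    return handled


def events_for(request):
    """Client sends a request, then closes before waiting for the reply."""
    conn = Conn(request)
    return [("readable", conn), ("hangup", conn)]


lost = run(events_for("search rootDSE"), defer_hangup=False)
kept = run(events_for("search rootDSE"), defer_hangup=True)
print(lost, kept)
```

With the hangup applied immediately, the queued reader finds the connection already closed and the request is dropped; with the hangup queued behind the reader, the request is still seen. Queuing the close is just one way to restore the ordering in this model, and is not a claim about how the real fix should look.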