On Tue, 2009-05-26 at 04:26 -0700, Howard Chu wrote:
> abartlet(a)samba.org wrote:
>> Full_Name: Andrew Bartlett
>> Version: CVS HEAD
>> OS: Fedora 10
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (22.214.171.124)
>> Samba4's provision and 'make test' seem to put OpenLDAP slapd into some
>> internal state where it will not accept any more connections over ldapi:///
>> This is best seen by building Samba4, and running
>> TEST_LDAP=yes OPENLDAP_ROOT=/usr/local make test
>> The slapd does not crash, but simply stops accepting new connections. Samba4
>> then crashes due to a separate bug (the case where the LDAP backend does not
>> respond is clearly an untested code path in Samba4).
>> It isn't a Samba4 client bug, as ldapsearch also gets no response.
>> This seems very, very similar to ITS#5261.
> Further testing with Andrew's kvm image shows the hang only occurs when Cyrus
> SASL's libsasldb2.so plugin is present. I always remove that plugin from my
> installs, since I only use in-directory SASL secrets. That's probably why I
> wasn't seeing the reported behavior before.
> Also a note - it's still not clear that we've been talking about the same thing
> up to this point. Even when the samba test suite hangs, I see that ldapsearch
> still works fine against slapd. At any rate, currently all of the samba4 tests
> pass for me.
Hmm. Using that KVM image, with the libsasldb moved aside (and with it
left in place), I still get errors.
However, there is an important difference: previously, once it locked
up, nothing proceeded; now it proceeds, as if the failure were only
temporary. The error has changed too: instead of a failure to connect,
it is a failure to read the rootDSE.
I agree, we might be jumping at different shadows here, but your patches
did fix something...
I see what you're describing now, with the kvm configured with 2 CPUs. It appears to
be a bug caused by the recent patch for connection_hangup() processing.
Running slapd with -d15 in your test shows that a connection is closed shortly
after being established and becoming readable. The bug is (probably) that we
queued the reader but processed the hangup immediately, thus closing the
connection before the reader executes. I'm not exactly sure why this is
causing the problem on your test, since it looks like your client is closing
the socket before waiting for the reply. But certainly this is the right area.
-- Howard Chu
CTO, Symas Corp.