>>> Both, as well as when running the HEAD test suite with the 2.4.23
>>> release. Looks as if the swamp additions have tripped over an
>>> existing problem, not anything new. Leave it out of RE24 until it
>>> has been resolved?
>>> Btw, any other Solaris test runs out there? I'd like to know if it is
>>> a real Solaris problem or just me.
> I'm seeing a similar failure on 32-bit Sparc Solaris 10. But it actually
> hangs up in test036 for me; I never get as far as test039. The gdb trace
> looks the same as what you posted.
> Looks like for some reason threads that are blocked waiting for their
> socket to become writable are never getting woken up. A regular SIGINT
> shuts down slapd cleanly, so it doesn't appear to be a problem with the
> condvars being used to manage the threads. That kinda points to select()
> simply not reporting the writable status.
> I haven't used this Solaris machine much, but in fact (looking at the
> dates of other files in my source tree on this box) this appears to have
> been a problem since at least last August. (I.e., it looks like I was
> investigating this same problem back then but dropped it and never got
> back to it.)
Not sure whether it is related, but I'm currently running test036 with
-DLDAP_THREAD_DEBUG (for unrelated purposes) and I see some mutex-related
failures, of the type
conn=1031 op=1 SRCH base="cn=Monitor" scope=2 deref=0
ldap_pvt_thread_mutex_unlock error: !THREAD_MUTEX_OWNER( mutex )
ldap_pvt_thread_mutex_unlock error: rc is 1
I see a lot of them; they always appear within operations affecting
back-monitor, which seems to be consistent with Rein's backtrace.
Linux fl1 188.8.131.52-0.5-desktop #1 SMP PREEMPT 2010-10-25 08:40:12 +0200
x86_64 x86_64 x86_64 GNU/Linux
Running with valgrind/helgrind, I get a hang on Linux too. Unfortunately I
can't get a backtrace from the valgrind'd slapd. It shows a fair number of
data races in back-meta.
There are also some lock ordering issues, but we already know about most of
them and the code avoids deadlock using trylock() when needed. But there are a
couple of places that don't, and thus are deadlock hazards. (request and
abandon in libldap seem to be the prime offenders.)
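The trylock() back-off mentioned above can be sketched roughly as follows. This is an illustration using plain pthreads, not the libldap code or the ldap_pvt_thread wrappers; lock_both() and unlock_both() are hypothetical names.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Acquire two mutexes that other threads may take in the opposite order.
 * Instead of blocking on the second mutex while holding the first, try it;
 * if it is busy, release the first and start over, so that two threads
 * can never each hold one lock while waiting forever for the other.
 * Returns 0 with both mutexes held, or an errno value with neither held. */
int lock_both(pthread_mutex_t *first, pthread_mutex_t *second)
{
    for (;;) {
        int rc = pthread_mutex_lock(first);
        if (rc != 0)
            return rc;

        rc = pthread_mutex_trylock(second);
        if (rc == 0)
            return 0;               /* both locks are now held */

        pthread_mutex_unlock(first);
        if (rc != EBUSY)
            return rc;              /* a real error, not contention */

        sched_yield();              /* let the other thread make progress */
    }
}

/* Release in the reverse order of acquisition. */
void unlock_both(pthread_mutex_t *first, pthread_mutex_t *second)
{
    pthread_mutex_unlock(second);
    pthread_mutex_unlock(first);
}
```

The cost of this pattern is possible spinning under heavy contention; a path that takes the second lock with a blocking lock() while holding the first is exactly the kind of hazard described above.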
I've uploaded my testrun directory to
for reference. (Looks like ftp.openldap.org
is full again.)
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/