Re: RE24 testing call #1 (OL 2.4.24)

11 Jan 2011


      masarati@aero.polimi.it wrote:
...
...
...
...
Both, as well as when running the HEAD test suite with the 2.4.23
release.  Looks as if the swamp additions have tripped over an
existing problem, not anything new.  Leave it out of RE24 until it
has been resolved?
Btw, any other Solaris test runs out there?  I'd like to know if it is
a real Solaris problem or just me..
I'm seeing a similar failure on 32-bit Sparc Solaris 10. But it actually
locks up in test036 for me; I never get as far as test039. The gdb trace
looks much the same as what you posted.
Looks like for some reason threads that are blocked waiting for their
sockets to become writable are never getting woken up. A regular SIGINT
shuts down slapd cleanly, so it doesn't appear to be a problem with the
condvars being used to manage the threads. That kinda points to select()
simply not returning the writable status.
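
For readers unfamiliar with the pattern under suspicion, here is a minimal sketch of waiting for a descriptor to become writable with select(). It is not slapd's code: wait_writable() and demo_pipe_writable() are hypothetical names invented for illustration, and the timeout handling is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not taken from slapd: wait until fd is writable.
 * Returns 1 if writable, 0 on timeout, -1 on error. */
static int wait_writable(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE);   /* select() cannot watch larger fds */

    for (;;) {
        fd_set wfds;
        struct timeval tv;
        int rc;

        /* select() may modify both the set and the timeout,
         * so rebuild them on every attempt. */
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(fd + 1, NULL, &wfds, NULL, &tv);
        if (rc < 0 && errno == EINTR)
            continue;                     /* interrupted by a signal: retry */
        if (rc < 0)
            return -1;
        if (rc == 0)
            return 0;                     /* timed out, still not writable */
        return FD_ISSET(fd, &wfds) ? 1 : 0;
    }
}

/* Hypothetical demo: the write end of a fresh, empty pipe has buffer
 * space, so it should be reported writable right away. */
static int demo_pipe_writable(void)
{
    int p[2];
    int rc;

    if (pipe(p) != 0)
        return -1;
    rc = wait_writable(p[1], 100);
    close(p[0]);
    close(p[1]);
    return rc;
}
```

If select() never reports the descriptor in the write set, a caller looping on a helper like this stays parked even though the peer has drained its buffer, which is the kind of behaviour described above.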
I haven't used this Solaris machine much, but in fact (looking at the
remnants of other files in my source tree on this box) this appears to
have been a problem since at least last August. (I.e., it looks like I was
investigating this same problem back then but dropped it and never got
back to it.)
Not sure whether it is related, but I'm currently running test036 with
-DLDAP_THREAD_DEBUG (for unrelated purposes) and I see some mutex-related
failures of the type

conn=1031 op=1 SRCH base="cn=Monitor" scope=2 deref=0 filter="(objectClass=*)"
../../../ldap-2.4-src/libraries/libldap_r/thr_debug.c:1029: ldap_pvt_thread_mutex_unlock error: !THREAD_MUTEX_OWNER( mutex )
../../../ldap-2.4-src/libraries/libldap_r/thr_debug.c:1033: ldap_pvt_thread_mutex_unlock error: rc is 1

I see a lot of them; they always appear within operations affecting
back-monitor, which seems consistent with Rein's backtrace.
uname -a
Linux fl1 2.6.34.7-0.5-desktop #1 SMP PREEMPT 2010-10-25 08:40:12 +0200 x86_64 x86_64 x86_64 GNU/Linux
Running with valgrind/helgrind, I get a hang on Linux too. Unfortunately I
can't get a backtrace from the valgrind'd slapd. It shows a fair number of
data races in back-meta.
There are also some lock ordering issues, but we already know about most of
them and the code avoids deadlock using trylock() when needed. But a couple
of places don't, and thus are deadlock hazards. (request and abandon in
libldap seem to be the prime offenders.)
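
To illustrate the trylock() approach mentioned above, here is a minimal sketch of taking two mutexes in an order that conflicts with other code paths without risking deadlock: hold the first, try the second, and back off completely if it is busy. lock_pair() and demo_lock_pair() are hypothetical names and this is not the libldap code; it only assumes standard POSIX mutexes.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: acquire `first` and then `second`, never blocking
 * on `second` while `first` is held. Returns 0 with both mutexes held,
 * or a pthread error code with neither held. */
static int lock_pair(pthread_mutex_t *first, pthread_mutex_t *second)
{
    assert(first != second);

    for (;;) {
        int rc = pthread_mutex_lock(first);
        if (rc != 0)
            return rc;

        rc = pthread_mutex_trylock(second);
        if (rc == 0)
            return 0;                  /* both locks are now held */

        /* Could not get the second lock: release the first so a thread
         * locking in the opposite order can make progress. */
        pthread_mutex_unlock(first);
        if (rc != EBUSY)
            return rc;                 /* real error, not mere contention */
        sched_yield();                 /* let the other thread run, retry */
    }
}

/* Hypothetical demo: with no contention the pair is acquired at once. */
static int demo_lock_pair(void)
{
    pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;
    int rc = lock_pair(&a, &b);

    if (rc == 0) {
        pthread_mutex_unlock(&b);
        pthread_mutex_unlock(&a);
    }
    return rc;
}
```

A path that instead calls pthread_mutex_lock() on the second mutex while holding the first can deadlock against a thread locking in the opposite order, which is the hazard described above.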
I've uploaded my testrun directory to
  http://highlandsun.com/hyc/20110111-testr.tgz
for reference. (Looks like ftp.openldap.org is full again.)
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
