Re: (ITS#6856) OpenLDAP 2.4.23 / db 4.7.25 lockup 100% CPU - openldap-bugs

7 Mar 2011


requate@univention.de wrote:
...
Full_Name: Arvid Requate
Version: 2.4.23
OS: Debian Lenny
URL: http://apt.univention.de/download/temp/openldap/trace_openldap_2.4.23_db_4.7...
Submission from: (NULL) (82.198.197.8)
With OpenLDAP 2.4.23 and bdb 4.7.25 we seem to hit something like a race
condition that can be triggered by concurrent ldapdelete and search_s
operations.
Though somewhat similar, this condition does not quite match the details of
ITS#5707. The URL provides a tar archive containing three gdb traces and
corresponding slapd log output (loglevel: trace args stats) of three cases of
lockup, where slapd hangs consuming 100% of CPU after a couple of modifications
with the shell script contained in the tar archive and remains unresponsive
until restarted. The number of successful operations varies between the test
runs.
Berkeley DB 4.7.25 (May 15, 2008) was built with Oracle patches for Bugs #16415
and #16541 and configure options "--enable-posixmutexes
--with-mutex=POSIX/pthreads".
The test machine is a single processor/single core 686 VM running Linux 2.6.32
686 bigmem. The concurrent searches are performed by a separate process that
gets informed about LDAP modifications (via file) by a slapd overlay module
called 'translog'. To me the traces do not seem to indicate a problem in the
overlay code (i.e. there is no reference to the on_response function
"translog_response" in the traces).
Maybe there is some obvious point here we are missing? More debug details can be
provided if necessary.
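For anyone wanting to retest against another slapd or BerkeleyDB build, the access pattern described above (deletes racing with searches) can be approximated with a small client-side harness. This is only a rough sketch and not the shell script from the tar archive: the function names, the use of a thread in place of the separate search process, and the python-ldap wiring in make_ldap_ops are all assumptions made for illustration.

```python
import threading


def run_concurrently(delete_one, search_once, dns):
    """Delete each DN in `dns` while a second thread searches in a loop.

    `delete_one(dn)` and `search_once()` are caller-supplied callables
    (hypothetical names), so the harness itself has no LDAP dependency.
    """
    stop = threading.Event()
    counts = {"deleted": 0, "searched": 0}

    def searcher():
        # Keep issuing searches until the delete loop has finished.
        while not stop.is_set():
            search_once()
            counts["searched"] += 1

    worker = threading.Thread(target=searcher, daemon=True)
    worker.start()
    try:
        for dn in dns:
            delete_one(dn)
            counts["deleted"] += 1
    finally:
        stop.set()
        worker.join(timeout=10)
    # A searcher that is still alive after the join timeout suggests the
    # server stopped answering; a hung delete blocks the main loop instead.
    counts["searcher_hung"] = worker.is_alive()
    return counts


def make_ldap_ops(uri, bind_dn, password, base):
    """Build the two callables on top of python-ldap (assumed installed).

    Two separate connections are used so that the searches and deletes
    do not share one client-side handle.
    """
    import ldap  # imported lazily; only needed for a real run

    writer = ldap.initialize(uri)
    writer.simple_bind_s(bind_dn, password)
    reader = ldap.initialize(uri)
    reader.simple_bind_s(bind_dn, password)

    def delete_one(dn):
        writer.delete_s(dn)

    def search_once():
        reader.search_s(base, ldap.SCOPE_SUBTREE, "(objectClass=*)")

    return delete_one, search_once
```

A real run would pass the pair returned by make_ldap_ops, together with a list of disposable test DNs, to run_concurrently. If slapd locks up as described, the delete loop is expected to block rather than return, so it is worth watching the slapd CPU usage alongside the harness.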
Try again using 2.4.24. There was a bug with back-bdb delete fixed recently 
(ITS#6577) so the relevant code has changed since .23.
Also try a newer BerkeleyDB. We've had other deadlocks with 4.7 that no longer 
occur in 4.8.
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/