Full_Name: Arvid Requate
OS: Debian Lenny
Submission from: (NULL) (22.214.171.124)
With OpenLDAP 2.4.23 and bdb 4.7.25 we seem to hit something like a race
condition that can be triggered by concurrent ldapdelete and search_s.
Though a bit similar, this condition does not quite match the details of
ITS#5707. The URL provides a tar archive containing three gdb traces and
corresponding slapd log output (loglevel: trace args stats) of three cases of
lockup, where slapd hangs consuming 100% of CPU after a couple of modifications
with the shell script contained in the tar archive and remains unresponsive
until restarted. The number of successful operations varies between test runs.
Berkeley DB 4.7.25 (May 15, 2008) was built with Oracle patches for Bugs #16415
and #16541 and configure options "--enable-posixmutexes
The test machine is a single processor/single core 686 VM running Linux 2.6.32
686 bigmem. The concurrent searches are performed by a separate process that
gets informed about LDAP modifications (via file) by an slapd overlay module
called 'translog'. To me the traces do not seem to indicate a problem in the
overlay code (i.e. there is no reference to the on_response function
"translog_response" in the traces).
Maybe there is some obvious point here we are missing? More debug details can be
provided if necessary.
Try again using 2.4.24. There was a bug with back-bdb delete fixed recently
(ITS#6577) so the relevant code has changed since .23.
Also try a newer BerkeleyDB. We've had other deadlocks with 4.7 that no longer
occur in 4.8.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/