OpenLDAP 2.4.23 hangs when creating new group objects


Mark Cave-Ayland

17 Mar 2011

8:20 a.m.

Hi all,

Having just upgraded our internal LDAP server from Debian Lenny (2.4.16 internal build) to Debian Squeeze (2.4.23), we have started to see instances where the slapd process hangs and stops responding to all requests until we kill -9 and restart the process.

Bizarrely enough, we can reproduce this pretty much every time when we try and create a new LDAP group using the GOsa web administration tool. Is this a known issue at all? Next time it happens, I'm happy to post a backtrace if you let me know what output you need from gdb to debug this.

Many thanks,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.


Howard Chu

17 Mar 2011

8:27 a.m.

Mark Cave-Ayland wrote:

> Hi all,
>
> Having just upgraded our internal LDAP server from Debian Lenny (2.4.16 internal build) to Debian Squeeze (2.4.23), we have started to see instances where the slapd process hangs and stops responding to all requests until we kill -9 and restart the process.
>
> Bizarrely enough, we can reproduce this pretty much every time when we try and create a new LDAP group using the GOsa web administration tool. Is this a known issue at all? Next time it happens, I'm happy to post a backtrace if you let me know what output you need from gdb to debug this.

It would be more useful if you can reproduce this on 2.4.24.

--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/

Quanah Gibson-Mount

9:10 a.m.

--On Thursday, March 17, 2011 8:27 AM -0700 Howard Chu <hyc@symas.com> wrote:

> Mark Cave-Ayland wrote:
>
>> Hi all,
>>
>> Having just upgraded our internal LDAP server from Debian Lenny (2.4.16 internal build) to Debian Squeeze (2.4.23), we have started to see instances where the slapd process hangs and stops responding to all requests until we kill -9 and restart the process.
>>
>> Bizarrely enough, we can reproduce this pretty much every time when we try and create a new LDAP group using the GOsa web administration tool. Is this a known issue at all? Next time it happens, I'm happy to post a backtrace if you let me know what output you need from gdb to debug this.
>
> It would be more useful if you can reproduce this on 2.4.24.

Debian's squeeze build of OpenLDAP also contains a patch known to corrupt the database. The first thing you want to do is abandon their build.

--Quanah

Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration

Mark Cave-Ayland

9:32 a.m.

On 17/03/11 16:10, Quanah Gibson-Mount wrote:

> Debian's squeeze build of OpenLDAP also contains a patch known to corrupt the database. The first thing you want to do is abandon their build.

Really? Ugh. Thanks for the heads up - has anyone reported this upstream to Debian yet?

ATB,

Mark.

Quanah Gibson-Mount

9:41 a.m.

--On Thursday, March 17, 2011 4:32 PM +0000 Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> wrote:

> On 17/03/11 16:10, Quanah Gibson-Mount wrote:
>
>> Debian's squeeze build of OpenLDAP also contains a patch known to corrupt the database. The first thing you want to do is abandon their build.
>
> Really? Ugh. Thanks for the heads up - has anyone reported this upstream to Debian yet?

Yes, I reported it to them on 2/28. Canonical fixed the build within 4 hours. Debian's been absolutely silent.

--Quanah

Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration

Mark Cave-Ayland

9:31 a.m.

On 17/03/11 15:27, Howard Chu wrote:

>> Bizarrely enough, we can reproduce this pretty much every time when we try and create a new LDAP group using the GOsa web administration tool. Is this a known issue at all? Next time it happens, I'm happy to post a backtrace if you let me know what output you need from gdb to debug this.
>
> It would be more useful if you can reproduce this on 2.4.24.

Okay. In the meantime, I've just set up a development environment for testing and got the following backtrace from the hung process using gdb:

(gdb) bt full
#0  0x00007fa50aca8be5 in pthread_join (threadid=140346751547136, thread_return=0x0) at pthread_join.c:89
        __ignore = <value optimized out>
        _tid = 10340
        _buffer = {__routine = 0x7fa50aca8ab0 <cleanup>, __arg = 0x7fa506457d28, __canceltype = 105216464, __prev = 0x0}
        oldtype = 0
        result = <value optimized out>
#1  0x000000000042d72c in slapd_daemon () at /home/devel/openldap/trunk/servers/slapd/daemon.c:2842
        listener_tid = 140346751547136
        rc = 0
#2  0x000000000041ae6a in main (argc=9, argv=0x7fffd2f2e5b0) at /home/devel/openldap/trunk/servers/slapd/main.c:961
        i = 9
        no_detach = 0
        rc = -12
        urls = 0x7df0c0 "ldap:/// ldapi:///"
        username = 0x7df100 "root"
        groupname = 0x7df0e0 "ldap"
        sandbox = 0x0
        syslogUser = 160
        configfile = 0x7df120 "/etc/ldap/slapd.conf"
        configdir = 0x0
        serverName = <value optimized out>
        scp = <value optimized out>
        scp_entry = <value optimized out>
        debug_unknowns = 0x0
        syslog_unknowns = 0x0
        slapd_pid_file_unlink = 1
        slapd_args_file_unlink = 1
        firstopt = <value optimized out>
        __PRETTY_FUNCTION__ = "main"
(gdb)

Maybe not entirely helpful, but now the test environment is set up, I'll have a go with a source build of 2.4.24 with full debug enabled and see if it is still reproducible there.

ATB,

Mark.
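
Editorial note: the trace above covers only the thread gdb had selected, which is the main thread waiting in pthread_join() on the listener thread (frame #1, listener_tid). Stacks from the remaining threads are usually what shows where a hang actually sits. The following is a minimal sketch, not something posted in the thread, of collecting them non-interactively. It assumes gdb is installed, that the caller is allowed to attach to the process, and that debug symbols are available; the helper names gdb_backtrace_command and capture_backtrace are hypothetical.

```python
import subprocess
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb batch-mode command line that dumps every thread's stack.

    Hypothetical helper: it only assembles the arguments and runs nothing.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "gdb",
        "-p", str(pid),                      # attach to the running process
        "-batch",                            # run the commands below, then exit
        "-ex", "set pagination off",         # do not stop at a pager prompt
        "-ex", "thread apply all bt full",   # full backtrace of every thread
    ]


def capture_backtrace(pid: int, timeout: float = 60.0) -> str:
    """Run gdb against `pid` and return whatever it printed on stdout."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # gdb may exit non-zero; the output is still worth keeping
    )
    return result.stdout
```

Calling capture_backtrace() with the PID of the hung slapd process and saving the returned text to a file gives output that can be attached to a list post or bug report. gdb detaches when it exits, so the process is left in the state it was found in.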

Mark Cave-Ayland

10:24 a.m.

On 17/03/11 15:27, Howard Chu wrote:

> It would be more useful if you can reproduce this on 2.4.24.

Okay - I've just completed two builds from vanilla source, one for 2.4.23 and another for 2.4.24. Under 2.4.23, I see exactly the same crash in pthread_join() and I have to kill -9 the slapd process. Fortunately the 2.4.24 build seems to work fine and doesn't exhibit the problem.

Based upon the fact it seems like a pthread/locking issue, do you have an ITS reference I can chase with upstream Debian? This is an absolute showstopper IMO as we're seeing multiple hard crashes a day even on our local, minimally loaded LDAP server running 2.4.23.

ATB,

Mark.


openldap-technical@openldap.org
