I'm running OpenLDAP 2.4.8 with Berkeley DB 2.4.6 on Solaris-10 compiled with Sun's compiler.
If I start and stop slapd in short succession it will hang after the 2nd or 3rd time. Following are the syslog message, a pstack (backtrace), and sanitized copy of slapd.conf. Any help in debugging this would be appreciated.
Thanks. Roy
Mar 11 16:00:33 master.nyc.deshaw.com slapd[3991]: [ID 702911 local4.debug] @(#) $OpenLDAP: slapd 2.4.8 (Mar 10 2008 13:35:31) $
Mar 11 16:00:33 master.nyc.deshaw.com mselby@desab005.nyc.deshaw.com:/var/tmp/build/openldap-2.4.8/servers/slapd
Mar 11 16:00:33 master.nyc.deshaw.com slapd[3991]: [ID 991073 local4.error] nss_order: ldap: not a member of group: staff
Mar 11 16:00:33 master.nyc.deshaw.com slapd[3991]: [ID 610730 local4.debug] pkcs11_softtoken: Cannot create keystore.
Mar 11 16:00:33 master.nyc.deshaw.com slapd[3995]: [ID 100111 local4.debug] slapd starting
Mar 11 16:00:39 master.nyc.deshaw.com slapd[3995]: [ID 543694 local4.debug] daemon: shutdown requested and initiated.
Mar 11 16:00:39 master.nyc.deshaw.com slapd[3995]: [ID 542995 local4.debug] slapd shutdown: waiting for 0 threads to terminate
13476: /usr/local/openldap/libexec/slapd -u ldap -g ldap
----------------- lwp# 1 / thread# 1 --------------------
 fe001117 lwp_wait (2, 80476f8)
 fdffd326 _thrp_join (2, 0, 0, 1) + 5a
 fdffd4a5 pthread_join (2, 0, 80871e0, 0) + 2b
 08088b02 slapd_daemon () + 7a
----------------- lwp# 2 / thread# 2 --------------------
 fe00040b lwp_park (0, 0, 0)
 fdffac7a cond_wait_queue (83114ac, 8311494, 0, 0) + 68
 fdffb146 _cond_wait (83114ac, 8311494) + 66
 fdffb188 cond_wait (83114ac, 8311494) + 21
 fdffb1c1 pthread_cond_wait (83114ac, 8311494, a7, 821603c) + 1b
 08196c6c ldap_pvt_thread_pool_destroy (fe000542, 0, 83114cc, fdee8ec0, fdc70000, 0) + e4
 083114c8 ???????? ()
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/desco.schema
include /etc/openldap/schema/sendmail.schema

pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
password-hash {CRYPT}

modulepath /usr/local/openldap/libexec/openldap
moduleload syncprov.la
moduleload rwm.la

loglevel 256
sizelimit 512000

TLSCertificateFile /etc/openldap/ssl/server.crt
TLSCertificateKeyFile /etc/openldap/ssl/server.key
TLSCACertificateFile /etc/openldap/ssl/ca.crt
TLSVerifyClient never

access to *
    by dn="cn=ldapadm,dc=nyc,dc=example,dc=com" write
    by dn="cn=syncrepl,dc=nyc,dc=example,dc=com" write
    by * read

database bdb
suffix "dc=nyc,dc=example,dc=com"
rootdn "cn=root,dc=nyc,dc=example,dc=com"
rootpw secrete
directory /var/openldap/nyc.example.com/data
cachesize 15000
checkpoint 512 720
index objectClass eq
index entryCSN eq
index entryUUID eq
index ou eq
index uid eq
index cn eq,sub
index sn eq,sub
index uidNumber eq
index gidNumber eq
index userPassword eq
index memberUid eq
index ipHostNumber eq
index ipNetworkNumber eq
index ipProtocolNumber eq
index ipServiceProtocol eq
index oncRpcNumber eq
index macAddress eq
index ipServicePort eq
index automountKey eq
index amdKey eq
index desmailKey eq
index desmiscKey eq
index sendmailMTAAliasGrouping eq
index sendmailMTAMapName eq
index sendmailMTAKey eq

serverID 1

database relay
suffix "ou=sendmail,dc=example,dc=mail"
overlay rwm
suffixmassage "ou=sendmail,dc=nyc,dc=example,dc=com"
database monitor
On Tue, 11 Mar 2008, Marantz, Roy wrote:
> I'm running OpenLDAP 2.4.8 with Berkeley DB 2.4.6 on Solaris-10 compiled with Sun's compiler.
Are you sure you haven't typoed the bdb version number here? I mean, the most significant digit should be a "4", if nothing else...
Look for the slapd log message from bdb_open (-d trace) to find your Sleepycat version.
> If I start and stop slapd in short succession it will hang after the 2nd or 3rd time. Following are the syslog message, a pstack (backtrace), and sanitized copy of slapd.conf. Any help in debugging this would be appreciated.
>
> Mar 11 16:00:33 master.nyc.deshaw.com slapd[3995]: [ID 100111 local4.debug] slapd starting
> Mar 11 16:00:39 master.nyc.deshaw.com slapd[3995]: [ID 543694 local4.debug] daemon: shutdown requested and initiated.
> Mar 11 16:00:39 master.nyc.deshaw.com slapd[3995]: [ID 542995 local4.debug] slapd shutdown: waiting for 0 threads to terminate
Your point is that you're hanging on a shutdown that you initiated, right? Or is it that slapd refuses to fully start and/or is exiting of its own volition shortly after startup or.....?
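One way to settle which case applies is to check, per slapd PID, whether a shutdown request was logged without a matching final message. The following is only a rough sketch: the function name `find_hung_shutdowns` is made up for illustration, the regular expression is fitted to the Solaris syslog lines quoted in this thread, and it assumes slapd logs a line containing "slapd stopped" when it exits cleanly.

```python
import re

# Fitted to the syslog lines quoted in this thread:
#   "... slapd[PID]: [ID nnnnnn facility.level] message"
LINE_RE = re.compile(r"slapd\[(\d+)\]: \[ID \d+ \S+\] (.*)")


def find_hung_shutdowns(lines):
    """Return PIDs that logged a shutdown request but no final stop message."""
    state = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        pid, msg = int(m.group(1)), m.group(2)
        if "shutdown requested" in msg:
            state[pid] = "stopping"
        elif "slapd stopped" in msg:
            # Assumption: this is the last message of a clean exit.
            state[pid] = "stopped"
    return sorted(pid for pid, s in state.items() if s == "stopping")


SAMPLE_LOG = """\
Mar 11 16:00:33 master.nyc.deshaw.com slapd[3995]: [ID 100111 local4.debug] slapd starting
Mar 11 16:00:39 master.nyc.deshaw.com slapd[3995]: [ID 543694 local4.debug] daemon: shutdown requested and initiated.
Mar 11 16:00:39 master.nyc.deshaw.com slapd[3995]: [ID 542995 local4.debug] slapd shutdown: waiting for 0 threads to terminate
"""

print(find_hung_shutdowns(SAMPLE_LOG.splitlines()))
```

A PID in the result means the shutdown was requested but never seen to finish, which matches the "hanging on a shutdown you initiated" reading. The sketch does not handle PID reuse across restarts.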
> 13476: /usr/local/openldap/libexec/slapd -u ldap -g ldap
> ----------------- lwp# 1 / thread# 1 --------------------
>  fe001117 lwp_wait (2, 80476f8)
>  fdffd326 _thrp_join (2, 0, 0, 1) + 5a
>  fdffd4a5 pthread_join (2, 0, 80871e0, 0) + 2b
>  08088b02 slapd_daemon () + 7a
> ----------------- lwp# 2 / thread# 2 --------------------
>  fe00040b lwp_park (0, 0, 0)
>  fdffac7a cond_wait_queue (83114ac, 8311494, 0, 0) + 68
>  fdffb146 _cond_wait (83114ac, 8311494) + 66
>  fdffb188 cond_wait (83114ac, 8311494) + 21
>  fdffb1c1 pthread_cond_wait (83114ac, 8311494, a7, 821603c) + 1b
>  08196c6c ldap_pvt_thread_pool_destroy (fe000542, 0, 83114cc, fdee8ec0, fdc70000, 0) + e4
>  083114c8 ???????? ()
Well, easy enough, figure out which lock it's waiting for ;)
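To make the backtrace easier to read, here is a minimal sketch that splits pstack output per thread and reports the first frame that is not a threading wait primitive, i.e. the caller that is actually blocked. The helper names (`parse_pstack`, `waiting_callers`) and the `WAIT_PRIMITIVES` set are invented for this example, and the parsing is fitted only to the pstack format shown in this thread.

```python
import re

HEADER_RE = re.compile(r"lwp#\s*(\d+)\s*/\s*thread#\s*(\d+)")
FRAME_RE = re.compile(r"^\s*[0-9a-f]{8}\s+(\S+)\s*\(([^)]*)\)")

# Illustrative list of wait primitives seen in this backtrace.
WAIT_PRIMITIVES = {
    "lwp_park", "lwp_wait", "cond_wait_queue", "_cond_wait",
    "cond_wait", "pthread_cond_wait", "_thrp_join", "pthread_join",
}


def parse_pstack(text):
    """Map thread id -> list of (function, [args]) frames, innermost first."""
    threads, current = {}, None
    for line in text.splitlines():
        header = HEADER_RE.search(line)
        if header:
            current = int(header.group(2))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            args = [a.strip() for a in frame.group(2).split(",") if a.strip()]
            threads[current].append((frame.group(1), args))
    return threads


def waiting_callers(threads):
    """For each thread, name the first frame that is not a wait primitive."""
    result = {}
    for tid, frames in threads.items():
        for name, _args in frames:
            if name not in WAIT_PRIMITIVES:
                result[tid] = name
                break
    return result


PSTACK = """\
13476: /usr/local/openldap/libexec/slapd -u ldap -g ldap
----------------- lwp# 1 / thread# 1 --------------------
 fe001117 lwp_wait (2, 80476f8)
 fdffd326 _thrp_join (2, 0, 0, 1) + 5a
 fdffd4a5 pthread_join (2, 0, 80871e0, 0) + 2b
 08088b02 slapd_daemon () + 7a
----------------- lwp# 2 / thread# 2 --------------------
 fe00040b lwp_park (0, 0, 0)
 fdffac7a cond_wait_queue (83114ac, 8311494, 0, 0) + 68
 fdffb146 _cond_wait (83114ac, 8311494) + 66
 fdffb188 cond_wait (83114ac, 8311494) + 21
 fdffb1c1 pthread_cond_wait (83114ac, 8311494, a7, 821603c) + 1b
 08196c6c ldap_pvt_thread_pool_destroy (fe000542, 0, 83114cc, fdee8ec0, fdc70000, 0) + e4
 083114c8 ???????? ()
"""

threads = parse_pstack(PSTACK)
print(waiting_callers(threads))
```

On this trace, thread 1 is just joining thread 2, and thread 2 is parked in pthread_cond_wait called from ldap_pvt_thread_pool_destroy. The first two values printed for pthread_cond_wait (83114ac and 8311494) should correspond to the condition variable and the mutex, assuming pstack is showing the real arguments here.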
I think the "shutdown you asked for" interpretation is right, so under that theory:
Seeing as you're at shutdown, my guess would be it's trying to close the bdb database in
directory /var/openldap/nyc.example.com/data
so go there, run db_stat -CA, and see what locks are still held. Assuming that you've initiated a shutdown, there really shouldn't be anything left...although there might be more going on. For example, syncrepl might still be going even if you haven't done anything "by hand." (Although your stack trace and syslogs belie that...)
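If you want to capture the lock table each time the hang occurs, a small wrapper along these lines might help. It is only a sketch: the function names are made up, it uses db_stat's -h option to point at the environment directory instead of changing into it, and it assumes the db_stat found on PATH comes from the same Berkeley DB release that slapd was linked against.

```python
import shutil
import subprocess


def db_stat_command(db_home):
    """Build the db_stat invocation for a full lock-table dump."""
    # -h selects the BDB environment directory; -CA prints all lock info.
    return ["db_stat", "-h", db_home, "-CA"]


def dump_lock_table(db_home):
    """Return db_stat's output as text, or None if it is unavailable or fails."""
    if shutil.which("db_stat") is None:
        return None
    proc = subprocess.run(
        db_stat_command(db_home), capture_output=True, text=True
    )
    return proc.stdout if proc.returncode == 0 else None


# Directory taken from the slapd.conf in this thread.
print(" ".join(db_stat_command("/var/openldap/nyc.example.com/data")))
```

Calling `dump_lock_table()` right after the hang and saving the result next to the pstack output gives both views of the same moment.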
But hey, why guess...turn up debugging: do you see the database being closed properly?
Looks like the problem was a corrupted database; at least, the problem wasn't reproducible on a known working database. BTW, I am using BDB 4.6.21.
Thanks for the help.
Roy
-----Original Message-----
From: Aaron Richton [mailto:richton@nbcs.rutgers.edu]
Sent: Tuesday, March 11, 2008 6:39 PM
To: Marantz, Roy
Cc: openldap-software@openldap.org
Subject: Re: Help with server hang