Hi,
Can anyone shed light on why my slapadd process on Solaris 11 was hanging (slapd 2.4.30)? I was piping 'gunzip -c' through to slapadd (from an overnight cron job), which usually works fine. I'm guessing the gunzip process encountered an error, perhaps an EPIPE, but the stack trace taken from slapadd many days later doesn't look right: only two threads, both in ldap_pvt_thread_cond_wait(), and no other threads, so what exactly are they waiting for?
# pstack /var/tmp/cores/core.slapd.0.16189.1463492837
core '/var/tmp/cores/core.slapd.0.16189.1463492837' of 16189: /usr/sbin/slapadd -q -f /etc/openldap/slapd.conf
------------ lwp# 1 / thread# 1 ---------------
07ff7af9 lwp_park (0, 0, 0)
07ff1193 cond_wait_queue (82ed048, 82ed020, 0, 7ff16b0) + 63
07ff1728 __cond_wait (82ed048, 82ed020, f5a685c8, 7ff1771) + 89
07ff177f cond_wait (82ed048, 82ed020, f5a685f8, 7ff17b7) + 27
07ff17cc pthread_cond_wait (82ed048, 82ed020, f5a68618, 7e56137) + 24
07e5614c ldap_pvt_thread_cond_wait (82ed048, 82ed020, f5a68648, 81b7e29) + 24
081b7e7a bdb_tool_entry_close (8e9d398, ffffffff, f5a68698, 82ecc38) + 62
081a38ec slapadd (4, f5a68bbc, f5a68bbc, 80dccb1) + cb8
080dcd2c main (4, f5a68bbc, f5a68bd0, ef60f968) + ac
080dc86d _start (4, f5a68c8e, f5a68ca0, f5a68ca3, f5a68ca6, 0) + 7d
------------ lwp# 2 / thread# 2 ---------------
07ff7af9 lwp_park (0, 0, 0)
07ff1193 cond_wait_queue (8e6b0e4, 8e6b0cc, 0, 7ff16b0) + 63
07ff1728 __cond_wait (8e6b0e4, 8e6b0cc, d7dffdb8, 7ff1771) + 89
07ff177f cond_wait (8e6b0e4, 8e6b0cc, 8e6b0c8, 7ff17b7) + 27
07ff17cc pthread_cond_wait (8e6b0e4, 8e6b0cc, 0, 7e56137) + 24
07e5614c ldap_pvt_thread_cond_wait (8e6b0e4, 8e6b0cc, d7dfffc8, 7e54bfc) + 24
07e54d66 ldap_int_thread_pool_wrapper (8e6b0c8, 803a000, d7dfffe8, 7ff7769) + 242
07ff77bc _thrp_setup (ee1e0240) + 9d
07ff7aa0 _lwp_start (ee1e0240, 0, 0, 0, 0, 0)
Thanks, Mark.
________________________________
NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies; do not disclose, use or act upon the information; and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent permitted under applicable law, to monitor electronic communications. This message is subject to terms available at the following link: http://www.morganstanley.com/disclaimers If you cannot access these links, please notify us by reply message and we will send the contents to you. By messaging with Morgan Stanley you consent to the foregoing.
Bannister, Mark wrote:
> Hi,
> Can anyone shed light on why my slapadd process on Solaris 11 was hanging (slapd 2.4.30)? I was piping 'gunzip -c' through to slapadd (from an overnight cron job). This usually works fine. I'm guessing perhaps the gunzip process encountered an error, maybe an EPIPE was sent, but the stack trace many days later from slapadd doesn't look right. Only 2 threads, and both of them in ldap_pvt_thread_cond_wait(), but no other threads so what are they waiting for exactly?
You'd need a backtrace with debug symbols to know exactly.
Your lwp #2 is in the thread pool, probably waiting for lwp #1 to finish. Why lwp #1 is still waiting, dunno.
Howard Chu wrote:
> Your lwp #2 is in the thread pool, probably waiting for lwp #1 to finish. Why lwp #1 is still waiting, dunno.
Yes, that was all I could conclude as well. As to why lwp #1 is still waiting: it's in bdb_tool_entry_close(), but without the debug symbols it's not easy to work out whether it's waiting on the trickle condition or the index condition. What did seem odd is that it was blocked on a condition at all when there was no other active thread capable of unblocking it. A race condition in the code somewhere, I guess.
I've opened a support case with Oracle, as it's their bundled version of OpenLDAP. Hopefully they can dig further as they will have the debug symbols. Unfortunately I don't. They might follow up on this thread in the future if they have questions.
Thanks, Mark.
> Howard Chu wrote:
> > Your lwp #2 is in the thread pool, probably waiting for lwp #1 to finish. Why lwp #1 is still waiting, dunno.
> Yes that was all I could conclude as well. As to why lwp #1 is still waiting, well it's in bdb_tool_entry_close(), but without the debug symbols it's not easy to work out whether it's waiting on the trickle condition or the index condition. What did seem odd is why it was blocked on a condition at all when there was no other active thread capable of unblocking it. A race condition in the code somewhere, I guess.
Call me impatient, but I've dived into the disassembly now rather than waiting on Oracle, who have the debug symbols.
> ::walk thread | ::findstack -v
stack pointer for thread 1: f5a68538
[ f5a68538 libc_hwcap1.so.1`__lwp_park+0x19() ]
  f5a68568 libc_hwcap1.so.1`cond_wait_queue+0x63(82ed048, 82ed020, 0, 7ff16b0)
  f5a685a8 libc_hwcap1.so.1`__cond_wait+0x89(82ed048, 82ed020, f5a685c8, 7ff1771)
  f5a685c8 libc_hwcap1.so.1`cond_wait+0x27(82ed048, 82ed020, f5a685f8, 7ff17b7)
  f5a685e8 libc_hwcap1.so.1`pthread_cond_wait+0x24(82ed048, 82ed020, f5a68618, 7e56137)
  f5a68608 libldap_r-2.4.so.2.8.3`ldap_pvt_thread_cond_wait+0x24(82ed048, 82ed020, f5a68648, 81b7e29)
  f5a68648 bdb_tool_entry_close+0x62(8e9d398, ffffffff, f5a68698, 82ecc38)
  f5a68ae8 slapadd+0xcb8(4, f5a68bbc, f5a68bbc, 80dccb1)
  f5a68b88 main+0xac(4, f5a68bbc, f5a68bd0, ef60f968)
  f5a68bb0 _start+0x7d(4, f5a68c8e, f5a68ca0, f5a68ca3, f5a68ca6, 0)
The offending ldap_pvt_thread_cond_wait(), which will never be signalled because no other thread remains to do it, is the call returning to bdb_tool_entry_close+0x62. Checking the disassembly:
> bdb_tool_entry_close::dis
bdb_tool_entry_close:       pushl  %ebp
bdb_tool_entry_close+1:     movl   %esp,%ebp
bdb_tool_entry_close+3:     pushl  %ebx
bdb_tool_entry_close+4:     pushl  %esi
bdb_tool_entry_close+5:     pushl  %edi
bdb_tool_entry_close+6:     subl   $0x1c,%esp
bdb_tool_entry_close+9:     andl   $0xfffffff0,%esp
bdb_tool_entry_close+0xc:   call   +0x0     <bdb_tool_entry_close+0x11>
bdb_tool_entry_close+0x11:  popl   %ebx
bdb_tool_entry_close+0x12:  addl   $0x1171d7,%ebx
bdb_tool_entry_close+0x18:  cmpl   $0x0,0x1dfa4(%ebx)
bdb_tool_entry_close+0x1f:  je     +0x17e   <bdb_tool_entry_close+0x1a3>
... we were at 0x62, so the je at +0x1f to bdb_tool_entry_close+0x1a3 was not taken. Checking the source code:
int
bdb_tool_entry_close( BackendDB *be )
{
	if ( bdb_tool_info ) {
... so bdb_tool_info was not a NULL pointer.
bdb_tool_entry_close+0x25:  movl   0x2dc(%ebx),%eax
bdb_tool_entry_close+0x2b:  movl   $0x1,(%eax)
bdb_tool_entry_close+0x31:  subl   $0xc,%esp
bdb_tool_entry_close+0x34:  leal   0x1e020(%ebx),%eax
bdb_tool_entry_close+0x3a:  pushl  %eax
bdb_tool_entry_close+0x3b:  call   -0xdcac0 <PLT=libldap_r-2.4.so.2.8.3`ldap_pvt_thread_mutex_lock>
bdb_tool_entry_close+0x40:  addl   $0x10,%esp
bdb_tool_entry_close+0x43:  cmpl   $0x0,0x1dfb8(%ebx)
bdb_tool_entry_close+0x4a:  jne    +0x22    <bdb_tool_entry_close+0x6e>
bdb_tool_entry_close+0x4c:  leal   0x1e048(%ebx),%esi
bdb_tool_entry_close+0x52:  leal   0x1e020(%ebx),%edi
bdb_tool_entry_close+0x58:  subl   $0x8,%esp
bdb_tool_entry_close+0x5b:  pushl  %edi
bdb_tool_entry_close+0x5c:  pushl  %esi
bdb_tool_entry_close+0x5d:  call   -0xdc882 <PLT=libldap_r-2.4.so.2.8.3`ldap_pvt_thread_cond_wait>
bdb_tool_entry_close+0x62:  addl   $0x10,%esp
This section of disassembly seems to match up quite nicely with the next section of code:
	slapd_shutdown = 1;
#ifdef USE_TRICKLE
	ldap_pvt_thread_mutex_lock( &bdb_tool_trickle_mutex );

	/* trickle thread may not have started yet */
	while ( !bdb_tool_trickle_active )
		ldap_pvt_thread_cond_wait( &bdb_tool_trickle_cond_end,
			&bdb_tool_trickle_mutex );
So bdb_tool_trickle_active is 0, and we are blocking on the condition variable bdb_tool_trickle_cond_end. The corresponding call to ldap_pvt_thread_cond_signal occurs in bdb_tool_trickle_task(). The trickle task function looks like this:
	ldap_pvt_thread_mutex_lock( &bdb_tool_trickle_mutex );
	bdb_tool_trickle_active = 1;
	ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond_end );
	while ( 1 ) {
		ldap_pvt_thread_cond_wait( &bdb_tool_trickle_cond,
			&bdb_tool_trickle_mutex );
		if ( slapd_shutdown )
			break;
		env->memp_trickle( env, 30, &wrote );
	}
	bdb_tool_trickle_active = 0;
	ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond_end );
	ldap_pvt_thread_mutex_unlock( &bdb_tool_trickle_mutex );
The cond_signal calls are all contained within one big mutex_lock. So could our thread have reached its mutex_lock *after* the very last invocation of the trickle task function, then blocked on its mutex_lock until the trickle task function returned, allowing our thread to reach a cond_wait which will never get a signal? There seems to be a race condition here.
The trickle task is a thread in the pool, submitted by bdb_tool_entry_open():
ldap_pvt_thread_pool_submit( &connection_pool, bdb_tool_trickle_task, bdb->bi_dbenv );
... except that no such thread was present in our core dump, so the trickle task must already have returned, which means slapd_shutdown was non-zero when it last checked.
Not being that familiar with the code, I'll stop here for now to see if anyone who is more familiar with it might now have an explanation as to how slapadd could have got itself into this state.
Thanks, Mark.
> The cond_signal calls are all contained within one big mutex_lock.
Sorry, I missed the fact that ldap_pvt_thread_cond_wait() inside bdb_tool_trickle_task() will unlock the bdb_tool_trickle_mutex.
But then, I think the author of the trickle task function has also forgotten this, because later on:
	bdb_tool_trickle_active = 0;
	ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond_end );
	ldap_pvt_thread_mutex_unlock( &bdb_tool_trickle_mutex );
Why even try to unlock the mutex when the earlier call to ldap_pvt_thread_cond_wait() will have done that already? Perhaps the intention was for the ldap_pvt_thread_cond_signal() to be called when the mutex was locked?
Actually, that might explain things, because the previous line:
bdb_tool_trickle_active = 0;
... is being called without holding the mutex, whereas bdb_tool_entry_close() is testing for !bdb_tool_trickle_active after acquiring the mutex. I think the fix is probably to do another mutex_lock() inside bdb_tool_trickle_task() prior to setting bdb_tool_trickle_active to 0. Like this:
--- servers/slapd/back-bdb/tools.c.orig	2012-02-29 17:37:09.000000000 +0000
+++ servers/slapd/back-bdb/tools.c	2016-05-27 12:35:31.285728000 +0100
@@ -1283,6 +1283,7 @@
 			break;
 		env->memp_trickle( env, 30, &wrote );
 	}
+	ldap_pvt_thread_mutex_lock( &bdb_tool_trickle_mutex );
 	bdb_tool_trickle_active = 0;
 	ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond_end );
 	ldap_pvt_thread_mutex_unlock( &bdb_tool_trickle_mutex );
> Sorry, I missed the fact that ldap_pvt_thread_cond_wait() inside bdb_tool_trickle_task() will unlock the bdb_tool_trickle_mutex.
... and now a colleague has pointed out, quite correctly, that cond_wait() reacquires the mutex before it returns.
So please disregard my suggested patch, which is wrong.
Will have to do some more thinking. There is a race condition here somewhere, just not sure where it is yet.
> Will have to do some more thinking. There is a race condition here somewhere, just not sure where it is yet.
The race seems to be here, in bdb_tool_entry_close(), introduced by commit 31175da3 while fixing ITS#6853:
	slapd_shutdown = 1;
#ifdef USE_TRICKLE
	ldap_pvt_thread_mutex_lock( &bdb_tool_trickle_mutex );

	/* trickle thread may not have started yet */
	while ( !bdb_tool_trickle_active )
		ldap_pvt_thread_cond_wait( &bdb_tool_trickle_cond_end,
			&bdb_tool_trickle_mutex );
The comment says "trickle thread may not have started yet", but it doesn't account for the possibility that the trickle thread has already finished. That matters because this function is not the only one that can cond_signal bdb_tool_trickle_cond; here's another, in bdb_tool_entry_put():
#ifdef USE_TRICKLE
	if (( slapMode & SLAP_TOOL_QUICK ) && (( e->e_id & 0xfff ) == 0xfff )) {
		ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond );
	}
#endif
... and we were running slapadd with the -q option. I think the chain of events that can lead to the situation I saw is:
bdb_tool_entry_close():    slapd_shutdown = 1;
bdb_tool_entry_put():      ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond );
bdb_tool_trickle_task():   bdb_tool_trickle_active = 0;
bdb_tool_entry_close():    ldap_pvt_thread_mutex_lock( &bdb_tool_trickle_mutex );
bdb_tool_trickle_task():   ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond_end );
bdb_tool_trickle_task():   ldap_pvt_thread_mutex_unlock( &bdb_tool_trickle_mutex );
bdb_tool_entry_close():    while ( !bdb_tool_trickle_active )
bdb_tool_entry_close():        ldap_pvt_thread_cond_wait( &bdb_tool_trickle_cond_end, &bdb_tool_trickle_mutex );
So perhaps we need bdb_tool_trickle_active to additionally indicate whether in fact bdb_tool_trickle_task() has finished already. How about this?
--- servers/slapd/back-bdb/tools.c.orig	2012-02-29 17:37:09.000000000 +0000
+++ servers/slapd/back-bdb/tools.c	2016-05-27 13:10:31.453451000 +0100
@@ -174,7 +174,7 @@
 			&bdb_tool_trickle_mutex );

 	ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond );
-	while ( bdb_tool_trickle_active )
+	while ( bdb_tool_trickle_active > 0 )
 		ldap_pvt_thread_cond_wait( &bdb_tool_trickle_cond_end,
 			&bdb_tool_trickle_mutex );
 	ldap_pvt_thread_mutex_unlock( &bdb_tool_trickle_mutex );
@@ -1283,7 +1283,7 @@
 			break;
 		env->memp_trickle( env, 30, &wrote );
 	}
-	bdb_tool_trickle_active = 0;
+	bdb_tool_trickle_active = -1;
 	ldap_pvt_thread_cond_signal( &bdb_tool_trickle_cond_end );
 	ldap_pvt_thread_mutex_unlock( &bdb_tool_trickle_mutex );
... or you could lock the bdb_tool_trickle_mutex prior to setting slapd_shutdown = 1 in bdb_tool_entry_close().
openldap-technical@openldap.org