https://www.nbcs.rutgers.edu/~richton/test008deadlock-RE24_20080414.txt
Could just be ITS#5391 proving that it still exists. BDB 4.2.52. testrun and the process are still just sitting there (from 10 days ago, should've looked at the machine more often...)
Aaron Richton wrote:
> https://www.nbcs.rutgers.edu/~richton/test008deadlock-RE24_20080414.txt
> Could just be ITS#5391 proving that it still exists. BDB 4.2.52. testrun and the process are still just sitting there (from 10 days ago, should've looked at the machine more often...)
Afraid not, this looks like something different.
In thread t@12 frame 4, can you "print *mutex" ?
Current function is ldap_pvt_thread_mutex_lock
  296   return ERRVAL( pthread_mutex_lock( mutex ) );
t@12 (l@12) stopped in __lwp_park at 0xff1654b0
0xff1654b0: __lwp_park+0x0010: ta %icc,0x00000008
(dbx) print *mutex
*mutex = {
    __pthread_mutex_flags = {
        __pthread_mutex_flag1   = 4U
        __pthread_mutex_flag2   = '\0'
        __pthread_mutex_ceiling = '\0'
        __pthread_mutex_type    = 0
        __pthread_mutex_magic   = 19800U
    }
    __pthread_mutex_lock = {
        __pthread_mutex_lock64 = {
            __pthread_mutex_pad = ""
        }
        __pthread_mutex_lock32 = {
            __pthread_ownerpid  = 0
            __pthread_lockword  = 4278190081U
        }
        __pthread_mutex_owner64 = 4278190081ULL
    }
Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.
Aaron Richton wrote:
> Current function is ldap_pvt_thread_mutex_lock
>   296   return ERRVAL( pthread_mutex_lock( mutex ) );
> t@12 (l@12) stopped in __lwp_park at 0xff1654b0
> 0xff1654b0: __lwp_park+0x0010: ta %icc,0x00000008
> (dbx) print *mutex
> *mutex = {
>     __pthread_mutex_flags = {
>         __pthread_mutex_flag1   = 4U
>         __pthread_mutex_flag2   = '\0'
>         __pthread_mutex_ceiling = '\0'
>         __pthread_mutex_type    = 0
>         __pthread_mutex_magic   = 19800U
>     }
>     __pthread_mutex_lock = {
>         __pthread_mutex_lock64 = {
>             __pthread_mutex_pad = ""
>         }
>         __pthread_mutex_lock32 = {
>             __pthread_ownerpid  = 0
>             __pthread_lockword  = 4278190081U
>         }
>         __pthread_mutex_owner64 = 4278190081ULL
>     }
On Mon, 28 Apr 2008, Howard Chu wrote:
> Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.
OK. This isn't hitting me in production (to my knowledge at least), only as "please test RE24." So I'll wait for the next "please test RE24" and give it another go...I'm sleeping until then. Thanks.
Aaron Richton wrote:
> On Mon, 28 Apr 2008, Howard Chu wrote:
>> Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.
> OK. This isn't hitting me in production (to my knowledge at least), only as "please test RE24." So I'll wait for the next "please test RE24" and give it another go...I'm sleeping until then. Thanks.
Go ahead and test the patch in HEAD, thanks.
On Tue, 29 Apr 2008, Howard Chu wrote:
> Aaron Richton wrote:
>> On Mon, 28 Apr 2008, Howard Chu wrote:
>>> Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.
>> OK. This isn't hitting me in production (to my knowledge at least), only as "please test RE24." So I'll wait for the next "please test RE24" and give it another go...I'm sleeping until then. Thanks.
> Go ahead and test the patch in HEAD, thanks.
The HEAD patch causes a deadlock when bdb_cache_delete_cleanup() is called by bdb_cache_lru_purge(), as the latter already has the entryinfo locked.
Rein
Rein Tollevik wrote:
> On Tue, 29 Apr 2008, Howard Chu wrote:
>> Go ahead and test the patch in HEAD, thanks.
> The HEAD patch causes a deadlock when bdb_cache_delete_cleanup() is called by bdb_cache_lru_purge(), as the latter already has the entryinfo locked.
Thanks, now fixed in HEAD.