RE24 deadlock


Aaron Richton

24 Apr 2008

11:56 a.m.

https://www.nbcs.rutgers.edu/~richton/test008deadlock-RE24_20080414.txt

Could just be ITS#5391 proving that it still exists. BDB 4.2.52. testrun and the process are still just sitting there (from 10 days ago, should've looked at the machine more often...)


Howard Chu

24 Apr 2008

3:18 p.m.

Aaron Richton wrote:

> https://www.nbcs.rutgers.edu/~richton/test008deadlock-RE24_20080414.txt
>
> Could just be ITS#5391 proving that it still exists. BDB 4.2.52. testrun and the process are still just sitting there (from 10 days ago, should've looked at the machine more often...)

Afraid not, this looks like something different.

In thread t@12 frame 4, can you "print *mutex" ?

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Aaron Richton

4:30 p.m.

Current function is ldap_pvt_thread_mutex_lock
  296       return ERRVAL( pthread_mutex_lock( mutex ) );
t@12 (l@12) stopped in __lwp_park at 0xff1654b0
0xff1654b0: __lwp_park+0x0010:  ta  %icc,0x00000008
(dbx) print *mutex
*mutex = {
    __pthread_mutex_flags = {
        __pthread_mutex_flag1   = 4U
        __pthread_mutex_flag2   = '\0'
        __pthread_mutex_ceiling = '\0'
        __pthread_mutex_type    = 0
        __pthread_mutex_magic   = 19800U
    }
    __pthread_mutex_lock = {
        __pthread_mutex_lock64 = {
            __pthread_mutex_pad = ""
        }
        __pthread_mutex_lock32 = {
            __pthread_ownerpid = 0
            __pthread_lockword = 4278190081U
        }
        __pthread_mutex_owner64 = 4278190081ULL
    }

Howard Chu

28 Apr 2008

1:49 a.m.

Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.


Aaron Richton

7:45 a.m.

On Mon, 28 Apr 2008, Howard Chu wrote:

> Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.

OK. This isn't hitting me in production (to my knowledge at least), only as "please test RE24." So I'll wait for the next "please test RE24" and give it another go...I'm sleeping until then. Thanks.

Howard Chu

29 Apr 2008

4:38 a.m.

Aaron Richton wrote:

> On Mon, 28 Apr 2008, Howard Chu wrote:
>
>> Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.
>
> OK. This isn't hitting me in production (to my knowledge at least), only as "please test RE24." So I'll wait for the next "please test RE24" and give it another go...I'm sleeping until then. Thanks.

Go ahead and test the patch in HEAD, thanks.


Rein Tollevik

8:22 a.m.

On Tue, 29 Apr 2008, Howard Chu wrote:

> Aaron Richton wrote:
>
>> On Mon, 28 Apr 2008, Howard Chu wrote:
>>
>>> Your stack trace shows that this is due to the latest patch to back-bdb/cache.c, rev 1.175. If you revert that patch this deadlock will go away. But that's probably not the ultimate solution.
>>
>> OK. This isn't hitting me in production (to my knowledge at least), only as "please test RE24." So I'll wait for the next "please test RE24" and give it another go...I'm sleeping until then. Thanks.
>
> Go ahead and test the patch in HEAD, thanks.

The HEAD patch causes a deadlock when bdb_cache_delete_cleanup() is called by bdb_cache_lru_purge(), as the latter already has the entryinfo locked.

Rein
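
The pattern Rein describes (a routine holding a lock calls a helper that tries to take the same lock again) can be reproduced in isolation. The sketch below is not the back-bdb code: entry_mutex, cleanup_entry(), purge_while_locked() and demo_relock() are hypothetical stand-ins, and the mutex is deliberately created as PTHREAD_MUTEX_ERRORCHECK so that the relock attempt returns EDEADLK rather than blocking the thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pthread_mutexattr_settype() under strict -std modes */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a cache entry's mutex; not back-bdb's real type. */
static pthread_mutex_t entry_mutex;

/* Hypothetical cleanup helper: takes the entry lock itself. */
static int cleanup_entry(void)
{
    int rc = pthread_mutex_lock(&entry_mutex);
    if (rc != 0)
        return rc;  /* error-checking mutex: EDEADLK if this thread already owns it */
    /* ... cleanup work would go here ... */
    return pthread_mutex_unlock(&entry_mutex);
}

/* Hypothetical purge routine: calls the helper while still holding the lock. */
static int purge_while_locked(void)
{
    int rc = pthread_mutex_lock(&entry_mutex);
    assert(rc == 0);
    rc = cleanup_entry();               /* relock attempt by the owning thread */
    pthread_mutex_unlock(&entry_mutex);
    return rc;
}

/* Create the mutex as PTHREAD_MUTEX_ERRORCHECK, run the purge once,
 * and return whatever the helper's lock attempt reported. */
int demo_relock(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&entry_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = purge_while_locked();
    pthread_mutex_destroy(&entry_mutex);
    return rc;
}
```

Calling demo_relock() from a test program should yield EDEADLK. With a PTHREAD_MUTEX_NORMAL mutex the same call sequence would leave the thread waiting on itself, which is the kind of hang reported in this thread; the usual remedies are to have the caller release the lock first or to give the helper a variant that assumes the lock is already held.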

Howard Chu

12:45 p.m.

Rein Tollevik wrote:

> On Tue, 29 Apr 2008, Howard Chu wrote:
>
>> Go ahead and test the patch in HEAD, thanks.
>
> The HEAD patch causes a deadlock when bdb_cache_delete_cleanup() is called by bdb_cache_lru_purge(), as the latter already has the entryinfo locked.

Thanks, now fixed in HEAD.

