Please test RE23


Quanah Gibson-Mount

13 May 2008

11:04 p.m.

Final anticipated fix is in for RE2.3.

Please test.

--Quanah

-- Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc -------------------- Zimbra :: the leader in open source messaging and collaboration


Quanah Gibson-Mount

13 May

11:37 p.m.

Hold off, latest fix broke cascaded syncrepl.

--Quanah


Howard Chu

14 May

1:51 a.m.

Quanah Gibson-Mount wrote:


Hold off, latest fix broke cascaded syncrepl.

That's a side effect of ITS#5385. I think we should just revert the last RE23 patch; I don't have time to find the correct fix to backport for #5385.


-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/

Quanah Gibson-Mount

2:49 a.m.

Reverted.

With the patch reverted, the delta-sync test passed some 10k times, and every test in the suite passed multiple times.

Shall I tag RE2.3.42?

--Quanah


Gavin Henry

10:15 a.m.

All fine here.

-- Kind Regards, Gavin Henry. Managing Director. T +44 (0) 1224 279484 M +44 (0) 7930 323266 F +44 (0) 1224 824887 E ghenry@suretecsystems.com Open Source. Open Solutions(tm). http://www.suretecsystems.com/ Suretec Systems is a limited company registered in Scotland. Registered number: SC258005. Registered office: 13 Whiteley Well Place, Inverurie, Aberdeenshire, AB51 4FP.

Jonathan Clarke

11:47 p.m.

Gavin Henry wrote:


All fine here.

All fine here too - tested on Ubuntu / i386 and Debian / amd64.

Regards,

-- Jonathan Clarke Open Source Software Assurance (OSSA) - Groupe LINAGORA 27 rue de Berri, 75008 Paris Tél: 01 58 18 68 28, fax: 01 58 18 68 29 http://www.linagora.com - http://www.08000linux.com

Xin LI

15 May

1:58 a.m.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

All tests successful on FreeBSD/amd64 7.0-STABLE.

Cheers, - -- ** Help China's quake relief at http://www.redcross.org.cn/ |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!


-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkgrfLIACgkQi+vbBBjt66CKPwCfdq7z4EkShT44tIaDEMVip7uI fg8AoLckJnGgDx8CCjbiLfVJDqdMxAkz =vv2I -----END PGP SIGNATURE-----

Aaron Richton

2:54 p.m.

Livelocked in test008. I don't think there are any mutexes held anywhere in the process, but I'm about to leave my desk and that's only at first glance. I'll poke at it more later to verify that statement...

https://www.nbcs.rutgers.edu/~richton/openldap-re23_20080515-livelock.txt


Howard Chu

16 May

9:21 a.m.

Aaron Richton wrote:


Livelocked in test008. I don't think there are any mutexes held anywhere in the process, but I'm about to leave my desk and that's only at first glance. I'll poke at it more later to verify that statement...

https://www.nbcs.rutgers.edu/~richton/openldap-re23_20080515-livelock.txt

You've run into this sort of thing before, IIRC. At the moment no bright ideas come to mind. (That may be a combination of jetlag and beer more than anything else.)

In thread t@4, frame 3, can you print *cx and *cx->ei ? Thanks.


Aaron Richton

3:10 p.m.


In thread t@4, frame 3, can you print *cx and *cx->ei ? Thanks.

BTW, definitely no locks held at the moment...

*cx = {
    bdb = 0x30d980
    op = 0x7bc6d58
    ei = 0x4978b0
    ids = 0xfd37f9ac
    tmp = 0x10a5018
    buf = 0x1125018
    db = 0x3668d0
    dbc = 0xfd33f7ec
    key = {
        data = (nil)
        size = 4U
        ulen = 4U
        dlen = 0
        doff = 0
        flags = 32U
    }
    data = {
        data = (nil)
        size = 0
        ulen = 0
        dlen = 0
        doff = 0
        flags = 0
    }
    dbuf = 0
    id = 5U
    nid = 5U
    rc = 0
    depth = 1
    need_sort = '\001'
    prefix = '@'
}

*cx->ei = {
    bei_parent = 0x497858
    bei_id = 5U
    bei_lockpad = 0
    bei_state = 64
    bei_nrdn = {
        bv_len = 34U
        bv_val = 0x37e9b0 "ou=information technology division"
    }
    bei_rdn = {
        bv_len = 34U
        bv_val = 0x3964b8 "ou=Information Technology Division"
    }
    bei_modrdns = 0
    bei_ckids = 5
    bei_dkids = 5
    bei_e = 0x46b5f20
    bei_kids = 0xfa1b48
    bei_kids_mutex = {
        __pthread_mutex_flags = {
            __pthread_mutex_flag1 = 4U
            __pthread_mutex_flag2 = '\0'
            __pthread_mutex_ceiling = '\0'
            __pthread_mutex_type = 0
            __pthread_mutex_magic = 19800U
        }
        __pthread_mutex_lock = {
            __pthread_mutex_lock64 = {
                __pthread_mutex_pad = ""
            }
            __pthread_mutex_lock32 = {
                __pthread_ownerpid = 0
                __pthread_lockword = 0
            }
            __pthread_mutex_owner64 = 0
        }
        __pthread_mutex_data = 0
    }
    bei_lrunext = 0x2caeb80
    bei_lruprev = 0x69cf90
}

Howard Chu

3:45 p.m.

Aaron Richton wrote:

In thread t@4, frame 3, can you print *cx and *cx->ei ? Thanks.

BTW, definitely no locks held at the moment...

Ok...

Looks like there may be an unsafe access of the bei_state here in dn2id.c. How long does it take to reproduce this situation? Can you try testing with this patch?

diff -u -r1.106.2.18 dn2id.c
--- dn2id.c	11 Feb 2008 23:24:19 -0000	1.106.2.18
+++ dn2id.c	16 May 2008 13:45:39 -0000
@@ -1152,7 +1152,11 @@
 			}
 			cx->depth--;
 			cx->op->o_tmpfree( save, cx->op->o_tmpmemctx );
-			if ( nokids ) ei->bei_state |= CACHE_ENTRY_NO_GRANDKIDS;
+			if ( nokids ) {
+				bdb_cache_entryinfo_lock( ei );
+				ei->bei_state |= CACHE_ENTRY_NO_GRANDKIDS;
+				bdb_cache_entryinfo_unlock( ei );
+			}
 		}
 		/* Make sure caller knows it had kids! */
 		cx->tmp[0]=1;
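The patch wraps a one-line flag update in a lock because `|=` on a shared word is a read-modify-write: if another thread changes a different bit of the same word between the read and the write, one of the two updates can be lost. The following is a minimal standalone sketch of the locked pattern using plain pthreads. All `demo_*` names and `DEMO_*` flags are hypothetical stand-ins, not back-bdb's `EntryInfo`, its `CACHE_ENTRY_*` flags, or `bdb_cache_entryinfo_lock()`, and the sketch makes no claim to reproduce the livelock reported in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
 * back-bdb definitions of EntryInfo or the CACHE_ENTRY_* flags. */
#define DEMO_NO_KIDS      0x01
#define DEMO_NO_GRANDKIDS 0x02
#define DEMO_ROUNDS       100000

typedef struct demo_entryinfo {
    pthread_mutex_t mutex;  /* guards state */
    int state;              /* bit flags shared between threads */
} demo_entryinfo;

typedef struct demo_job {
    demo_entryinfo *ei;
    int flag;
} demo_job;

/* Set a flag; the read-modify-write happens under the entry's mutex. */
static void demo_set_flag(demo_entryinfo *ei, int flag)
{
    pthread_mutex_lock(&ei->mutex);
    ei->state |= flag;
    pthread_mutex_unlock(&ei->mutex);
}

/* Clear a flag under the same mutex. */
static void demo_clear_flag(demo_entryinfo *ei, int flag)
{
    pthread_mutex_lock(&ei->mutex);
    ei->state &= ~flag;
    pthread_mutex_unlock(&ei->mutex);
}

/* Worker: toggle one flag many times, leaving it set at the end. */
static void *demo_worker(void *arg)
{
    demo_job *job = arg;
    int i;

    for (i = 0; i < DEMO_ROUNDS; i++) {
        demo_set_flag(job->ei, job->flag);
        demo_clear_flag(job->ei, job->flag);
    }
    demo_set_flag(job->ei, job->flag);
    return NULL;
}

/* Run two workers on different bits of the same word and return the
 * final value of the state word. */
static int demo_run(void)
{
    demo_entryinfo ei;
    demo_job jobs[2];
    pthread_t tids[2];
    int i, rc;

    ei.state = 0;
    rc = pthread_mutex_init(&ei.mutex, NULL);
    assert(rc == 0);

    jobs[0].ei = &ei;
    jobs[0].flag = DEMO_NO_KIDS;
    jobs[1].ei = &ei;
    jobs[1].flag = DEMO_NO_GRANDKIDS;

    for (i = 0; i < 2; i++) {
        rc = pthread_create(&tids[i], NULL, demo_worker, &jobs[i]);
        assert(rc == 0);
    }
    for (i = 0; i < 2; i++) {
        pthread_join(tids[i], NULL);
    }
    (void)rc;  /* silence unused warning when built with NDEBUG */

    pthread_mutex_destroy(&ei.mutex);
    return ei.state;
}
```

Called from a `main`, `demo_run()` returns with both bits set, because each worker's last action sets its own flag and no update can be interleaved with another. If the lock and unlock calls were dropped from the two helpers, a thread could write back a stale copy of the word and silently erase the other thread's bit, which is the kind of unsafe access the patch is guarding against.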


Aaron Richton

4:04 p.m.


Looks like there may be an unsafe access of the bei_state here in dn2id.c. How long does it take to reproduce this situation? Can you try testing with this patch?

Took a bit under two days of test008 in an infinite loop. I've got that compiling right now and will start the test back up once that's done...

Aaron Richton

19 May

3:33 p.m.

Ran all weekend OK.



openldap-devel@openldap.org
