We're seeing some strange issues with mirrormode. I don't have the whole picture yet, and I haven't been able to reproduce the problem reliably, but it happens regularly. The scenario is: 2 mirror-mode servers, each with 2 databases plus back-config, replicated using syncrepl refreshAndPersist. We notice occasional SIGSEGVs, and we finally managed to get a core dump, which contains:

#0  0x080e0e6f in compare_csns (sc1=0x33a96a80, sc2=0x33a96aa0, which=0x33a9682c) at ../../../servers/slapd/syncrepl.c:665
665             for (j=0; !BER_BVISNULL( &sc2->ctxcsn[j] ); j++) {
(gdb) bt full
#0  0x080e0e6f in compare_csns (sc1=0x33a96a80, sc2=0x33a96aa0, which=0x33a9682c) at ../../../servers/slapd/syncrepl.c:665
        i = 0
        j = 0
        match = 0
        text = 0x80e0e1a "\201�
#1  0x080e7385 in do_syncrep2 (op=0x33a96d40, si=0x919c548) at ../../../servers/slapd/syncrepl.c:990
        i = Variable "i" is not available.

(gdb) p *sc1
$2 = {ctxcsn = 0x9981310, octet_str = {bv_len = 60, bv_val = 0x9d2b680 "rid=004,sid=000,csn=20070925132254.897919Z#000000#000#000000"}, rid = 4, sid = 0, numcsns = 1, sids = 0x9ce3120, sc_next = {stqe_next = 0x0}}
(gdb) p *sc2
$3 = {ctxcsn = 0x0, octet_str = {bv_len = 20, bv_val = 0x9c299f0 "rid=004,sid=000,csn="}, rid = 4, sid = 0, numcsns = 0, sids = 0x0, sc_next = {stqe_next = 0x0}}
(gdb) p sc1->ctxcsn[0]
$4 = {bv_len = 40, bv_val = 0x9e98cf0 "20070925132254.897919Z#000000#000#000000"}
(gdb) p sc1->ctxcsn[1]
$5 = {bv_len = 0, bv_val = 0x0}
(gdb) p sc1->sids[0]
$6 = 0
(gdb) p sc1->sids[1]
$7 = 159661424
(gdb) p sc2->ctxcsn[0]
$8 = {bv_len = 0, bv_val = 0x0}
(gdb) p sc2->sids[0]
$9 = 0

Line numbers may be slightly off because of minor local customizations, which should not affect this functionality.
My first question is: is "rid=004,sid=000,csn=" a legitimate cookie? If it is, the code must avoid calling compare_csns() when the ctxcsn member of either struct sync_cookie is NULL. In that case, I suggest the patch:

Index: servers/slapd/syncrepl.c
===================================================================
RCS file: /repo/OpenLDAP/pkg/ldap/servers/slapd/syncrepl.c,v
retrieving revision 1.361
diff -u -r1.361 syncrepl.c
--- servers/slapd/syncrepl.c	29 Sep 2007 14:11:28 -0000	1.361
+++ servers/slapd/syncrepl.c	1 Oct 2007 12:25:05 -0000
@@ -985,7 +985,13 @@
 				if ( !BER_BVISNULL( &syncCookie.octet_str ) )
 				{
 					slap_parse_sync_cookie( &syncCookie, NULL );
-					compare_csns( &syncCookie_req, &syncCookie, &m );
+					m = 0;
+					if ( syncCookie.ctxcsn ) {
+						compare_csns( &syncCookie_req, &syncCookie, &m );
+					} else {
+						/* otherwise it would be dereferenced few lines below */
+						assert( !refreshDeletes );
+					}
 				}
 			}
 			if ( ber_peek_tag( ber, &len ) ==

So the second question, assuming the above cookie is legitimate, would be: is it correct to assume that such a cookie can only occur with refreshDeletes unset?
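
To make the failure mode concrete outside of slapd, here is a minimal standalone sketch of the kind of guard the patch adds. It assumes simplified stand-ins for struct berval and struct sync_cookie (not the real slapd definitions), and the helpers cookie_csn_count() and cookie_is_newer() are hypothetical names invented for illustration; in particular, cookie_is_newer() uses a plain string comparison and does not reproduce the per-sid matching done by compare_csns().

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for illustration only, NOT the slapd types. */
struct berval {
    size_t      bv_len;
    const char *bv_val;
};

struct sync_cookie {
    struct berval *ctxcsn;  /* array ended by a {0, NULL} entry, or NULL */
    int            numcsns;
};

#define BER_BVISNULL(bv) ((bv)->bv_val == NULL)

/* Hypothetical helper: count CSNs, treating a NULL array as empty
 * instead of walking it (walking it is what crashes in the core dump). */
static int cookie_csn_count(const struct sync_cookie *sc)
{
    int n = 0;

    if (sc == NULL || sc->ctxcsn == NULL)
        return 0;
    while (!BER_BVISNULL(&sc->ctxcsn[n]))
        n++;
    return n;
}

/* Hypothetical helper: return 1 if 'got' carries a CSN that no CSN in
 * 'req' equals or exceeds, 0 otherwise. A cookie with ctxcsn == NULL,
 * such as "rid=004,sid=000,csn=", is treated as carrying nothing new. */
static int cookie_is_newer(const struct sync_cookie *req,
                           const struct sync_cookie *got)
{
    int i, j;

    if (got == NULL || got->ctxcsn == NULL)
        return 0;

    for (j = 0; !BER_BVISNULL(&got->ctxcsn[j]); j++) {
        int covered = 0;

        if (req != NULL && req->ctxcsn != NULL) {
            for (i = 0; !BER_BVISNULL(&req->ctxcsn[i]); i++) {
                if (strcmp(req->ctxcsn[i].bv_val,
                           got->ctxcsn[j].bv_val) >= 0) {
                    covered = 1;
                    break;
                }
            }
        }
        if (!covered)
            return 1;
    }
    return 0;
}
```

With a request cookie holding the single CSN from the dump and a received cookie whose ctxcsn is NULL, cookie_is_newer() returns 0 without dereferencing the NULL array. Whether "nothing new" is the right interpretation of an empty cookie is exactly what the first question above asks.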
p.
Ing. Pierangelo Masarati OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati@sys-net.it
---------------------------------------