Full_Name: Aaron Richton
Version: 2.3.41
OS: Solaris 9
URL:
Submission from: (NULL) (68.192.238.168)

(dbx) where
current thread: t@18
  [1] __lwp_kill(0x0, 0x6, 0xffffffffffffffe6, 0x0, 0x0, 0x0), at 0xffffffff7f0a8d4c
  [2] raise(0x6, 0x0, 0xffffffff457fe390, 0x0, 0x0, 0x0), at 0xffffffff7f058dc0
  [3] abort(0x30, 0x0, 0x30, 0x7efefeff, 0x81010100, 0xff0000), at 0xffffffff7f03e688
  [4] __assert(0x10023d488, 0x10023d490, 0x8d, 0x1, 0xffffffff457fe758, 0x11b369c59), at 0xffffffff7f03e98c
=>[5] attr_dup(a = 0x1174b0470), line 141 in "attr.c"
  [6] attrs_dup(a = 0x1174b0470), line 166 in "attr.c"
  [7] hdb_modify_internal(op = 0xffffffff457ff618, tid = 0x11b369b70, modlist = 0x118ea8bd0, e = 0xffffffff457feab8, text = 0xffffffff457fefd8, textbuf = 0xffffffff457feb4c "^X\xea\x8b\xd0", textlen = 256U), line 55 in "modify.c"
  [8] hdb_modify(op = 0xffffffff457ff618, rs = 0xffffffff457fefb8), line 480 in "modify.c"
  [9] syncrepl_entry(si = 0x1004b5050, op = 0xffffffff457ff618, entry = 0x118d17870, modlist = 0xffffffff457ff328, syncstate = 2, syncUUID = 0xffffffff457ff3c0, syncCookie_req = 0xffffffff457ff360, syncCSN = 0xffffffff457ff390), line 1987 in "syncrepl.c"
  [10] do_syncrep2(op = 0xffffffff457ff618, si = 0x1004b5050), line 735 in "syncrepl.c"
  [11] do_syncrepl(ctx = 0xffffffff457ffc30, arg = 0x1004b5230), line 1095 in "syncrepl.c"
  [12] ldap_int_thread_pool_wrapper(xpool = 0x10041e700), line 478 in "tpool.c"

(dbx) list 130,150
  130
  131   		if ( a->a_nvals != a->a_vals ) {
  132   			int j;
  133
  134   			tmp->a_nvals = ch_malloc( (i + 1) * sizeof(struct berval) );
  135   			for ( j = 0; !BER_BVISNULL( &a->a_nvals[j] ); j++ ) {
  136   				assert( j < i );
  137   				ber_dupbv( &tmp->a_nvals[j], &a->a_nvals[j] );
  138   				if ( BER_BVISNULL( &tmp->a_nvals[j] ) ) break;
  139   				/* FIXME: error? */
  140   			}
  141   			assert( j == i );
  142   			BER_BVZERO( &tmp->a_nvals[j] );
  143
  144   		} else {
  145   			tmp->a_nvals = tmp->a_vals;
  146   		}
  147
  148   	} else {
  149   		tmp->a_vals = NULL;
  150   		tmp->a_nvals = NULL;

The crazy part is that I hit this on two slaves during the same replication cycle, yet four slaves DIDN'T crash during that same cycle. I think the database is hosed on those two slaves; I'll check more thoroughly tomorrow (I'm off site today).

The provider is 2.3.39, so perhaps it fed 2.3.41 garbage, but shouldn't 2.3.41 still be able to handle that gracefully? Or maybe an assert() counts as a graceful handler, but I'm not sure about that, since it kills my slave in the process.
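
To make the "graceful" alternative concrete, here is a standalone sketch of the same check returning an error instead of aborting. It is not OpenLDAP code and not a proposed patch: struct val, count_vals, free_vals and dup_vals_checked are made-up names, and the sketch assumes that i in the listing above is the number of entries in a_vals (it is set above the listed lines, so that is a guess). The idea is to keep assert() for programmer errors and report a mismatch in replicated data through the return value, so the caller can reject the one entry and keep running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct berval; not the OpenLDAP definition. */
struct val {
    size_t len;
    char  *data;    /* data == NULL marks the end of an array */
};

/* Count the entries that precede the NULL terminator. */
static size_t count_vals(const struct val *v)
{
    size_t n = 0;
    while (v[n].data != NULL)
        n++;
    return n;
}

/* Free a NULL-terminated array produced by dup_vals_checked. */
static void free_vals(struct val *v)
{
    if (v == NULL)
        return;
    for (size_t k = 0; v[k].data != NULL; k++)
        free(v[k].data);
    free(v);
}

/*
 * Duplicate src only if it holds exactly `expected` entries.
 * Returns 0 and stores the copy in *out on success; returns -1 and
 * leaves *out NULL on a count mismatch or an allocation failure.
 */
static int dup_vals_checked(const struct val *src, size_t expected,
                            struct val **out)
{
    assert(src != NULL && out != NULL);  /* programmer error, not bad data */

    *out = NULL;
    if (count_vals(src) != expected)
        return -1;                       /* bad data: report it, do not abort */

    /* calloc zeroes the array, so unused slots act as terminators. */
    struct val *tmp = calloc(expected + 1, sizeof *tmp);
    if (tmp == NULL)
        return -1;

    for (size_t j = 0; j < expected; j++) {
        tmp[j].data = malloc(src[j].len + 1);
        if (tmp[j].data == NULL) {
            free_vals(tmp);              /* frees the entries copied so far */
            return -1;
        }
        memcpy(tmp[j].data, src[j].data, src[j].len);
        tmp[j].data[src[j].len] = '\0';
        tmp[j].len = src[j].len;
    }

    *out = tmp;
    return 0;
}
```

A caller would treat -1 as "reject this entry and log it" rather than as a fatal condition. Whether slapd could safely carry on after such a mismatch is a separate question that this sketch does not answer.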