Hi,
After a long period of stability, slapd has started core dumping on me,
roughly once every two weeks per slave. The slaves don't all go at the
same time, but occasionally two will go within a few hours of each other.
To me, it looks like it SIGSEGVs in the same place as ITS5401/5, so my
question is:
Is there a patch for this that I can apply to 2.4.8, or is the recommended
route to check out HEAD and try that?
Further info below.
Cheers,
Duncan
I started with Solaris 10, OpenLDAP 2.3.38 and BDB 4.2.52 (patched),
moved to 2.4.7 on a clean Solaris install, and then moved everything to
2.4.8 with BDB 4.6.21. All versions have core dumped over the last few
months.
The setup is one master and 5 slaves: one specific to a service, one
test, and 3 in a round-robin config. All use syncrepl with
refreshAndPersist. Only the busy slaves (the ones in the round robin)
are failing.
Using dbx on 2.4.8, I've got cores from 2 slaves, but I haven't compiled
with debugging, and I'm not using stripped binaries, so these may be of
little to no use. Rebuilding with debugging is today's job.
What little info I have so far is:
From Slave 1
/export/opt/SUNWspro/bin/dbx /usr/local/libexec/slapd slapd-core-14-03-08
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.4' in your .dbxrc
Reading slapd
core file header read successfully
Reading ld.so.1
Reading libldap_r-2.4.so.2.0.4
Reading liblber-2.4.so.2.0.4
Reading libltdl.so.3.1.5
Reading libdb-4.6.so
Reading librt.so.1
Reading libpthread.so.1
Reading libicuuc.so.2
Reading libicudata.so.2
Reading libsasl2.so.2.0.22
Reading libdl.so.1
Reading libssl.so.0.9.8
Reading libcrypto.so.0.9.8
Reading libresolv.so.2
Reading libgen.so.1
Reading libnsl.so.1
Reading libsocket.so.1
Reading libc.so.1
Reading libgcc_s.so.1
Reading libgcc_s.so.1
Reading libaio.so.1
Reading libmd5.so.1
Reading libm.so.2
Reading libCrun.so.1
Reading libc_psr.so.1
Reading libgssapiv2.so.2.0.22
Reading libgssapi.so.4.0.0
Reading libkrb5.so.17.4.0
Reading libasn1.so.6.1.0
Reading libroken.so.16.1.0
Reading libcom_err.so.1.1.3
Reading libncurses.so.5.4
Reading libdoor.so.1
Reading libscf.so.1
Reading libuutil.so.1
Reading libmd5_psr.so.1
Reading libmp.so.2
Reading liblogin.so.2.0.22
Reading libplain.so.2.0.22
Reading syncprov-2.4.so.2.0.4
t@1 (l@1) terminated by signal KILL (Killed)
0xfe4bd61c: __lwp_wait+0x0008: bcc,a,pt %icc,__lwp_wait+0x18 ! 0xfe4bd62c
(dbx) threads
> t@1 a l@1 ?() LWP suspended in __lwp_wait()
t@3 a l@3 ?() LWP suspended in __pollsys()
o t@4 a l@4 ldap_int_thread_pool_wrapper() signal SIGSEGV in slap_access_allowed()
t@5 a l@5 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@6 a l@6 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@7 a l@7 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@8 a l@8 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@9 a l@9 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@10 a l@10 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@11 a l@11 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@12 a l@12 ldap_int_thread_pool_wrapper() LWP suspended in attrs_alloc()
t@13 a l@13 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@14 a l@14 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@15 a l@15 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@16 a l@16 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
(dbx) thread t@4
Current function is ldap_int_thread_pool_wrapper
625 task->ltt_start_routine(&ctx, task->ltt_arg);
t@4 (l@4) stopped in slap_access_allowed at 0x5a234
0x0005a234: slap_access_allowed+0x10ac: ldsb [%g1 - 1], %o5
(dbx) where
current thread: t@4
[1] slap_access_allowed(0x3bb2998, 0x2d442c, 0xecf3e0ac, 0x15c800, 0x3, 0x2031f0), at 0x5a234
[2] fe_access_allowed(0x3bb2998, 0x2d442c, 0x1f81b0, 0x15a2f40, 0x4, 0x0), at 0x5bc00
[3] access_allowed_mask(0x3bb2998, 0x2d442c, 0x1f81b0, 0x15a2f40, 0x4, 0x0), at 0x57048
[4] 0x54e0c(0x3bb2998, 0x2d442c, 0x15a2f3c, 0xa3, 0x603e5c0, 0xa3), at 0x54e0b
[5] test_filter(0x3bb2998, 0x2d442c, 0x15a2f5c, 0x0, 0x1c0000, 0x1c0000), at 0x55400
[6] hdb_search(0x3bb2998, 0xecfffcb8, 0x0, 0xfff3ffd8, 0xfff3fc00, 0x163000), at 0xb0d98
[7] overlay_op_walk(0x8000, 0xecfffcb8, 0x8000, 0x15c540, 0x8000, 0xecfff838), at 0x95a7c
[8] 0x95be4(0x3bb2998, 0xecfffcb8, 0x2, 0x5f, 0x95c28, 0x1f5d60), at 0x95be3
[9] fe_op_search(0x3bb2998, 0xecfffcb8, 0x3bb2a94, 0xecfffa38, 0x163438, 0x163528), at 0x3ae68
[10] do_search(0x3bb2998, 0xecfffcb8, 0xfe4e8bc0, 0x15c800, 0x123c00, 0xecfffa38), at 0x3a5e0
[11] 0x38b20(0xecfffe08, 0x3bb2998, 0xfe4e8bc0, 0xfdec0000, 0x13564c8, 0x0), at 0x38b1f
[12] 0x393c4(0x0, 0x2a, 0xfe4e8bc0, 0xfdec0000, 0x1dad28, 0x0), at 0x393c3
=>[13] ldap_int_thread_pool_wrapper(xpool = 0x1dad18), line 625 in "tpool.c"
Pstack of thread 4 on slave 1
----------------- lwp# 4 / thread# 4 --------------------
0005a234 slap_access_allowed (3bb2998, 2d442c, ecf3e0ac, 15c800, 3, 2031f0) + 10ac
0005bc00 fe_access_allowed (3bb2998, 2d442c, 1f81b0, 15a2f40, 4, 0) + 54
00057048 access_allowed_mask (3bb2998, 2d442c, 1f81b0, 15a2f40, 4, 0) + 17c
00054e0c ???????? (3bb2998, 2d442c, 15a2f3c, a3, 603e5c0, a3)
00055400 test_filter (3bb2998, 2d442c, 15a2f5c, 0, 1c0000, 1c0000) + 178
000b0d98 hdb_search (3bb2998, ecfffcb8, 0, fff3ffd8, fff3fc00, 163000) + 2074
00095a7c overlay_op_walk (8000, ecfffcb8, 8000, 15c540, 8000, ecfff838) + c8
00095be4 ???????? (3bb2998, ecfffcb8, 2, 5f, 95c28, 1f5d60)
0003ae68 fe_op_search (3bb2998, ecfffcb8, 3bb2a94, ecfffa38, 163438, 163528) + 3a0
0003a5e0 do_search (3bb2998, ecfffcb8, fe4e8bc0, 15c800, 123c00, ecfffa38) + 58c
00038b20 ???????? (ecfffe08, 3bb2998, fe4e8bc0, fdec0000, 13564c8, 0)
000393c4 ???????? (0, 2a, fe4e8bc0, fdec0000, 1dad28, 0)
ff34d89c ldap_int_thread_pool_wrapper (1dad18, ed000000, 0, 0, 0, 0) + 1ec
fe4bc400 _lwp_start (0, 0, 0, 0, 0, 0)
From Slave 2
----------------------------------------------------------------------
/export//opt/SUNWspro/bin/dbx /usr/local/libexec/slapd slapd-core-15-03-08
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.4' in your .dbxrc
Reading slapd
dbx: internal warning: writable memory segment 0xed980000[2359296] of size 0 in core
dbx: internal warning: writable memory segment 0xedc00000[262152192] of size 0 in core
dbx: internal warning: writable memory segment 0xfd800000[5349376] of size 0 in core
dbx: internal warning: writable memory segment 0xfdf30000[32768] of size 0 in core
dbx: internal warning: writable memory segment 0xfdf40000[483328] of size 0 in core
dbx: internal warning: writable memory segment 0xfe370000[24576] of size 0 in core
core file header read successfully
Reading ld.so.1
Reading libldap_r-2.4.so.2.0.4
Reading liblber-2.4.so.2.0.4
Reading libltdl.so.3.1.5
Reading libdb-4.6.so
Reading librt.so.1
Reading libpthread.so.1
Reading libicuuc.so.2
Reading libicudata.so.2
Reading libsasl2.so.2.0.22
Reading libdl.so.1
Reading libssl.so.0.9.8
Reading libcrypto.so.0.9.8
Reading libresolv.so.2
Reading libgen.so.1
Reading libnsl.so.1
Reading libsocket.so.1
Reading libc.so.1
Reading libgcc_s.so.1
Reading libgcc_s.so.1
Reading libaio.so.1
Reading libmd5.so.1
Reading libm.so.2
Reading libCrun.so.1
Reading libc_psr.so.1
Reading libgssapiv2.so.2.0.22
Reading libgssapi.so.4.0.0
Reading libkrb5.so.17.4.0
Reading libasn1.so.6.1.0
Reading libroken.so.16.1.0
Reading libcom_err.so.1.1.3
Reading libncurses.so.5.4
Reading libdoor.so.1
Reading libscf.so.1
Reading libuutil.so.1
Reading libmd5_psr.so.1
Reading libmp.so.2
Reading libplain.so.2.0.22
Reading liblogin.so.2.0.22
Reading syncprov-2.4.so.2.0.4
t@1 (l@1) terminated by signal KILL (Killed)
0xfe4bd61c: __lwp_wait+0x0008: bcc,a,pt %icc,__lwp_wait+0x18 ! 0xfe4bd62c
(dbx) threads
> t@1 a l@1 ?() LWP suspended in __lwp_wait()
t@3 a l@3 ?() LWP suspended in __pollsys()
t@4 a l@4 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@5 a l@5 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@6 a l@6 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@7 a l@7 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@8 a l@8 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@9 a l@9 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@10 a l@10 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@11 a l@11 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@12 a l@12 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@13 a l@13 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@14 a l@14 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@15 a l@15 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@16 a l@16 ldap_int_thread_pool_wrapper() LWP suspended in __lock_get_internal()
t@17 a l@17 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
o t@18 a l@18 ldap_int_thread_pool_wrapper() signal SIGSEGV in match_re_C()
t@19 a l@19 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
(dbx) thread t@18
Current function is ldap_int_thread_pool_wrapper
625 task->ltt_start_routine(&ctx, task->ltt_arg);
t@18 (l@18) stopped in match_re_C at 0xfe479f60
0xfe479f60: match_re_C+0x0b50: ldub [%i1], %l6
(dbx) where
current thread: t@18
[1] match_re_C(0x1f697b, 0x0, 0x1fa278, 0xe5f3dfac, 0xe5f3df9c, 0x73), at 0xfe479f60
[2] __regexec_C(0xfe4ea344, 0x1fa278, 0x0, 0x64, 0xe5f3e910, 0x0), at 0xfe478ecc
[3] slap_access_allowed(0x475ea18, 0x263544, 0xe5f3e06c, 0x15c800, 0x3, 0x1d76d0), at 0x59428
[4] fe_access_allowed(0x475ea18, 0x263544, 0x1d8290, 0x0, 0x5, 0x0), at 0x5bc00
[5] access_allowed_mask(0x475ea18, 0x263544, 0x1d8290, 0x0, 0x5, 0x0), at 0x57048
[6] slap_send_search_entry(0x8000, 0xe5fffcb8, 0xe5f3f448, 0x0, 0x5, 0x15c800), at 0x49174
[7] hdb_search(0x475ea18, 0xe5fffcb8, 0x0, 0xfff3ffd8, 0xfff3fc00, 0x163000), at 0xb0fd4
[8] overlay_op_walk(0x8000, 0xe5fffcb8, 0x8000, 0x15c540, 0x8000, 0xe5fff838), at 0x95a7c
[9] 0x95be4(0x475ea18, 0xe5fffcb8, 0x2, 0xa7, 0x95c28, 0x1f48e8), at 0x95be3
[10] fe_op_search(0x475ea18, 0xe5fffcb8, 0x475eb14, 0xe5fffa38, 0x163438, 0x163528), at 0x3ae68
[11] do_search(0x475ea18, 0xe5fffcb8, 0xfe4e8bc0, 0x15c800, 0x123c00, 0xe5fffa38), at 0x3a5e0
[12] 0x38b20(0xe5fffe08, 0x475ea18, 0xfe4e8bc0, 0xfdec3800, 0x135a360, 0x0), at 0x38b1f
[13] 0x393c4(0x0, 0x63, 0xfe4e8bc0, 0xfdec3800, 0x1dad28, 0x0), at 0x393c3
=>[14] ldap_int_thread_pool_wrapper(xpool = 0x1dad18), line 625 in "tpool.c"
Pstack of thread 18 on slave 2
----------------- lwp# 18 / thread# 18 --------------------
fe479f60 match_re_C (1f697b, 0, 1fa278, e5f3dfac, e5f3df9c, 73) + b50
fe478ecc __regexec_C (fe4ea344, 1fa278, 0, 64, e5f3e910, 0) + 16c
00059428 slap_access_allowed (475ea18, 263544, e5f3e06c, 15c800, 3, 1d76d0) + 2a0
0005bc00 fe_access_allowed (475ea18, 263544, 1d8290, 0, 5, 0) + 54
00057048 access_allowed_mask (475ea18, 263544, 1d8290, 0, 5, 0) + 17c
00049174 slap_send_search_entry (8000, e5fffcb8, e5f3f448, 0, 5, 15c800) + 158
000b0fd4 hdb_search (475ea18, e5fffcb8, 0, fff3ffd8, fff3fc00, 163000) + 22b0
00095a7c overlay_op_walk (8000, e5fffcb8, 8000, 15c540, 8000, e5fff838) + c8
00095be4 ???????? (475ea18, e5fffcb8, 2, a7, 95c28, 1f48e8)
0003ae68 fe_op_search (475ea18, e5fffcb8, 475eb14, e5fffa38, 163438, 163528) + 3a0
0003a5e0 do_search (475ea18, e5fffcb8, fe4e8bc0, 15c800, 123c00, e5fffa38) + 58c
00038b20 ???????? (e5fffe08, 475ea18, fe4e8bc0, fdec3800, 135a360, 0)
000393c4 ???????? (0, 63, fe4e8bc0, fdec3800, 1dad28, 0)
ff34d89c ldap_int_thread_pool_wrapper (1dad18, e6000000, 0, 0, 0, 0) + 1ec
fe4bc400 _lwp_start (0, 0, 0, 0, 0, 0)
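
A rough way to compare the two cores without debug symbols is to pull the
function names out of each pstack and see which frames they share. The
sketch below is only an illustration in Python: the helper names
(frame_names, common_frames) are made up for this example, the sample input
is just the top few frames of each pstack above, and it assumes every frame
starts with an 8-digit hex address, as in the 32-bit output here.

```python
import re

# A pstack frame line: 8 hex digits, whitespace, function name, then "(".
# Wrapped continuation lines (e.g. "2031f0) + 10ac") do not match.
FRAME_RE = re.compile(r"^\s*[0-9a-f]{8}\s+(\S+)\s*\(")


def frame_names(pstack_text):
    """Return function names from pstack output, innermost frame first."""
    names = []
    for line in pstack_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def common_frames(stack_a, stack_b):
    """Names found in both stacks, in stack_a order, skipping unresolved '????????'."""
    in_b = set(stack_b)
    return [n for n in stack_a if n in in_b and not n.startswith("?")]


# Top frames copied from the two pstacks in this mail (truncated sample).
SLAVE1 = """\
0005a234 slap_access_allowed (3bb2998, 2d442c, ecf3e0ac, 15c800, 3,
2031f0) + 10ac
0005bc00 fe_access_allowed (3bb2998, 2d442c, 1f81b0, 15a2f40, 4, 0) + 54
00057048 access_allowed_mask (3bb2998, 2d442c, 1f81b0, 15a2f40, 4, 0) + 17c
00054e0c ???????? (3bb2998, 2d442c, 15a2f3c, a3, 603e5c0, a3)
00055400 test_filter (3bb2998, 2d442c, 15a2f5c, 0, 1c0000, 1c0000) + 178
"""

SLAVE2 = """\
fe479f60 match_re_C (1f697b, 0, 1fa278, e5f3dfac, e5f3df9c, 73) + b50
fe478ecc __regexec_C (fe4ea344, 1fa278, 0, 64, e5f3e910, 0) + 16c
00059428 slap_access_allowed (475ea18, 263544, e5f3e06c, 15c800, 3,
1d76d0) + 2a0
0005bc00 fe_access_allowed (475ea18, 263544, 1d8290, 0, 5, 0) + 54
00057048 access_allowed_mask (475ea18, 263544, 1d8290, 0, 5, 0) + 17c
"""

print(common_frames(frame_names(SLAVE1), frame_names(SLAVE2)))
```

On these samples the shared frames are the ACL evaluation path
(slap_access_allowed and its callers), which is why I suspect both crashes
have the same cause, though shared frames alone don't prove it.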