Hello,
 
Since updating from OpenLDAP 2.4.23 to OpenLDAP 2.4.32, a slapd process crashes with a core dump about one to three times a week.
 
It seems to be caused by LDAP requests, as only some of our servers are affected, and all of those are in the same network zone.
 
What I have found out so far:
 
Syslog:
Mar  8 20:13:01 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48
Mar  8 20:13:01 vg0092 last message repeated 14 times
Mar  8 20:13:01 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48
Mar  8 20:13:01 vg0092 last message repeated 17 times
Mar  8 20:13:01 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48
Mar  8 20:13:01 vg0092 last message repeated 15 times
Mar  8 20:13:01 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48
Mar  8 20:13:02 vg0092 last message repeated 18 times
Mar  8 20:13:02 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48
Mar  8 20:13:11 vg0092 last message repeated 1091 times
Mar  8 20:13:11 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48
Mar  8 20:13:20 vg0092 last message repeated 1057 times
Mar  8 20:14:14 vg0092 genunix: [ID 603404 kern.notice] NOTICE: core_log: slapd[220] core dumped: /dpool/vg0092-data/ldap/core/core.slapd.220
Mar  8 20:14:14 vg0092 slapd[7288]: [ID 702911 local4.debug] @(#) $OpenLDAP: slapd 2.4.32 (Aug  5 2012 00:09:28) $
Mar  8 20:14:14 vg0092  steve@sunblade2500:/bigdisk/SOURCES/S10/openldap-2.4.32/servers/slapd
Mar  8 20:14:14 vg0092 slapd[7299]: [ID 643551 local4.debug] hdb_db_open: database "dc=scom": unclean shutdown detected; attempting recovery.
Mar  8 20:14:31 vg0092 last message repeated 2 times
Mar  8 20:14:42 vg0092 last message repeated 5 times
Mar  8 20:15:03 vg0092 slapd[8246]: [ID 702911 local4.debug] @(#) $OpenLDAP: slapd 2.4.32 (Aug  5 2012 00:09:28) $
Mar  8 20:15:03 vg0092  steve@sunblade2500:/bigdisk/SOURCES/S10/openldap-2.4.32/servers/slapd
Mar  8 20:15:03 vg0092 ldap: [ID 702911 user.warning] vg0092 slapd maintenance, rebuilding, WARNING
 
The ‘unknown filter’ messages are caused by HP-UX clients. The crash left the Berkeley DB database corrupt, so it had to be rebuilt.
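
To get a feeling for how many of these requests arrived before the crash, the "last message repeated N times" lines can be expanded. The following is only a minimal sketch: the function name count_occurrences and the SAMPLE list are made up for illustration, and it assumes that a repeat line always refers to the message logged directly before it.

```python
import re

# Matches the syslog compression line, e.g. "last message repeated 1091 times".
REPEAT_RE = re.compile(r"last message repeated (\d+) times?")

# Four lines copied from the syslog excerpt above (hypothetical sample input).
SAMPLE = [
    "Mar  8 20:13:02 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48",
    "Mar  8 20:13:11 vg0092 last message repeated 1091 times",
    "Mar  8 20:13:11 vg0092 slapd[220]: [ID 870088 local4.debug] get_filter: unknown filter type=48",
    "Mar  8 20:13:20 vg0092 last message repeated 1057 times",
]


def count_occurrences(lines, needle):
    """Count how often a message containing `needle` was logged.

    Assumption: a "last message repeated" line refers to the most recent
    ordinary message seen before it.
    """
    total = 0
    previous_matches = False
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat:
            # Only add the repeat count if the repeated message matched.
            if previous_matches:
                total += int(repeat.group(1))
            continue
        previous_matches = needle in line
        if previous_matches:
            total += 1
    return total


if __name__ == "__main__":
    print(count_occurrences(SAMPLE, "unknown filter type=48"))
```

Run against the full log, this should show whether the number of unknown-filter requests differs noticeably between the servers that crash and those that do not.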
 
Coredump:
# adb /usr/local/libexec/slapd core.slapd.220
core file = core.slapd.220 -- program ``/usr/local/libexec/slapd'' on platform SUNW,SPARC-Enterprise-T5120
SIGABRT: Abort
$c
libc.so.1`_lwp_kill+8(6, 0, fed87080, fecede54, ffffffff, 6)
libc.so.1`abort+0x110(b07ff4e8, 1, fed833f0, ffba0, fed85518, 0)
libc.so.1`_assert+0x64(12d0d0, 12c9d0, 3a8, 0, ff8bc, 19418c)
connection_next+0x138(0, b07ff7c4, b07ff7c0, 199d1c, fd17ba00, 1a2000)
0x112574(8000, b07ffcb8, 5e9bb4, 199d1c, b07ff8a8, 1c77a8)
monitor_entry_create+0x94(714ba50, b07ffcb8, 0, 545d64, b07ff8a8, 546084)
0xe1eec(714ba50, b07ffcb8, 545d3c, 0, 1, 1a2400)
monitor_back_search+0x248(714ba50, b07ffcb8, 0, 142a7da8, e1fb8, 1971d8)
fe_op_search+0x420(714ba50, b07ffcb8, 12d838, 0, 1a2928, 1a2a20)
do_search+0x618(714ba50, b07ffcb8, fed87940, 0, 3f0f4, b07ffa38)
0x3da44(b07ffe08, 714ba50, fed87940, 0, fd17ba00, 0)
0x3e3d0(0, 2f, fed87940, 0, fd17ba00, 2330ec)
libldap_r-2.4.so.2`ldap_int_thread_pool_wrapper+0x190(2330a8, b0800000, 0, 0, ff30ed80, 1)
libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)
 
pflags shows that LWP 25 might be the culprit:
 
# pflags core.slapd.220
core 'core.slapd.220' of 220:   /usr/local/libexec/slapd -4 -u ldap -g ldap -f /dpool/vg0092-data/ldap
        data model = _ILP32  flags = MSACCT|MSFORK
/1:    flags = STOPPED  lwp_wait(0x4,0xffbffb34)
        why = PR_SUSPENDED
/2:    flags = STOPPED  pollsys(0x4,0x9f,0x0,0x0)
        why = PR_SUSPENDED
/3:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/4:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/5:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/6:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/7:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/8:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/9:    flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/10:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/11:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/12:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/13:   flags = DETACH|STOPPED
        why = PR_SUSPENDED
/14:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/15:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/16:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/17:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/18:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/19:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/20:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/21:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/22:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/23:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/24:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/25:   flags = DETACH
        sigmask = 0xffffbefc,0x0000ffff  cursig = SIGABRT
/26:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/27:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/28:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/29:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/30:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/31:   flags = DETACH|STOPPED
        why = PR_SUSPENDED
/32:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/33:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
/34:   flags = DETACH|STOPPED  lwp_park(0x4,0x0,0x0)
        why = PR_SUSPENDED
 
pstack:
-----------------  lwp# 25 / thread# 25  --------------------
fed0e8cc _lwp_kill (6, 0, fed87080, fecede54, ffffffff, 6) + 8
fec82950 abort    (b07ff4e8, 1, fed833f0, ffba0, fed85518, 0) + 110
fec82b8c _assert  (12d0d0, 12c9d0, 3a8, 0, ff8bc, 19418c) + 64
0003cc64 connection_next (0, b07ff7c4, b07ff7c0, 199d1c, fd17ba00, 1a2000) + 138
00112574 ???????? (8000, b07ffcb8, 5e9bb4, 199d1c, b07ff8a8, 1c77a8)
00114670 monitor_entry_create (714ba50, b07ffcb8, 0, 545d64, b07ff8a8, 546084) + 94
000e1eec ???????? (714ba50, b07ffcb8, 545d3c, 0, 1, 1a2400)
000e2200 monitor_back_search (714ba50, b07ffcb8, 0, 142a7da8, e1fb8, 1971d8) + 248
0004005c fe_op_search (714ba50, b07ffcb8, 12d838, 0, 1a2928, 1a2a20) + 420
0003f70c do_search (714ba50, b07ffcb8, fed87940, 0, 3f0f4, b07ffa38) + 618
0003da44 ???????? (b07ffe08, 714ba50, fed87940, 0, fd17ba00, 0)
0003e3d0 ???????? (0, 2f, fed87940, 0, fd17ba00, 2330ec)
ff30ef10 ldap_int_thread_pool_wrapper (2330a8, b0800000, 0, 0, ff30ed80, 1) + 190
fed0abd8 _lwp_start (0, 0, 0, 0, 0, 0)
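
The _assert frame may already tell which assertion in connection_next failed. This is an assumption on my side: if _assert here takes (expression, file, line) as its first three arguments, then the third value, 3a8, would be the source line number in hexadecimal. The sketch below only does that conversion; the function name assert_line_number is made up, and whether the result really corresponds to a line in servers/slapd/connection.c of 2.4.32 still has to be checked against the source.

```python
import re

# The _assert frame exactly as printed by pstack above.
FRAME = "fec82b8c _assert  (12d0d0, 12c9d0, 3a8, 0, ff8bc, 19418c) + 64"


def assert_line_number(frame):
    """Return the third argument of an _assert frame as a decimal integer.

    Assumption: the third argument is the line number passed by the
    assert() macro; pstack prints the arguments in hexadecimal.
    """
    match = re.search(r"_assert\s*\(([^)]*)\)", frame)
    if match is None:
        raise ValueError("no _assert frame found")
    args = [arg.strip() for arg in match.group(1).split(",")]
    return int(args[2], 16)


if __name__ == "__main__":
    print(assert_line_number(FRAME))
```

The first two arguments (12d0d0 and 12c9d0) would then be the addresses of the expression and file name strings, which could be read from the slapd binary without having to share the core file.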
 
Questions:
 
Sending the core dump is not an option yet, as it contains all password hashes etc.
 
Regards
 
Jürgen Sprenger
 
E-Mail:   mailto:juergen.sprenger@swisscom.com
Internet: http://www.swisscom.com/it-services