"...now and then slapd just stops and always without any
traces in the logfiles. Sometimes three times a day, sometimes a week
without a failure. I can't find a pattern or any relation to any other
service on the Linux server."
Is your LDAP environment installed on a VM or on a bare-metal box?
Regards, Kuba
----- Original Message -----
From: Ruud Baart
Sent: 27/02/11 12:57 PM
To: openldap-technical(a)openldap.org
Subject: Problem unexpected failing slapd
Problem:
For a customer we have used LDAP for many years. Last year the slapd service
suddenly just stopped without any traces in the logfiles. After a restart of
slapd everything worked fine again. But it was not a one-off incident: now and
then slapd just stops, always without any traces in the logfiles. Sometimes
three times a day, sometimes a week without a failure. I can't find a pattern
or any relation to any other service on the Linux server.

Environment:
- Several (Debian squeeze) servers, several Windows servers. We use the bdb
  database backend.
- There is one master LDAP server, which provides syncprov, and two replica
  LDAP servers (syncrepl). The master server is the most intensively used
  (mainly Samba as primary domain controller: a few hundred user accounts, a
  lot of group accounts, workstations, ACLs, etc.). One of the replicas is not
  very busy but handles the mail for all users (lookups: amavis, postfix,
  courier-imap, mail account settings, etc.). The other replica is not busy at
  all; it is at a remote location.
- The total LDAP directory is 3700 DNs; slapcat produces a file of 7.3 MB.
- It is only the master LDAP server that stops suddenly. I have never seen a
  failure of a replica.

Because I have no clear idea about the problem, I have no idea which technical
details are relevant:

DB_CONFIG
===========
set_cachesize 0 10485760 1
set_lk_max_objects 10000
set_lk_max_locks 10000
set_lk_max_lockers 10000
set_lg_dir /home/ldap-dbd

The database is stored on an ext3 filesystem, kernel 2.6.32. The server has no
problems, plenty of memory and a fast disk array (SAS->SATA). There have never
been technical problems with this server, and it worked without problems for a
long period. Nothing has changed in the environment or the LDAP setup (except
of course the upgrade to Debian squeeze, but the problem was already there).

What we have tried:
- Upgrade from OpenLDAP 2.4.17 (Debian lenny + backports) to OpenLDAP 2.4.23
  (Debian squeeze). I saw in the release notes that problems related to
  syncrepl were solved. Therefore we waited for version 2.4.23 to become
  available in Debian. This upgrade made no difference.
- Reindexed and rebuilt the directory. When I rebuild the LDAP directory from a
  clean LDIF file with ldapadd, on the master or on another machine, there is
  not a single error or warning.

The workaround for the moment: I have written a process monitor (Perl daemon)
which monitors the slapd daemon and restarts slapd if it suddenly stops. It is
of course not a solution, but the 300 users can work. If slapd stops and is not
restarted within 1 minute, a few hundred people can't work because Samba stops
working.

I would like to receive suggestions on what we can do to find the problem.
Because there is no pattern and nothing in the logfiles, I don't know where to
start.

--
Regards, Ruud Baart
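
The watchdog workaround described in the message could look roughly like the
following. This is only a minimal sketch in Python, not the Perl daemon
mentioned above (that code is not shown in the thread). The function names
describe_exit and supervise, and the slapd path and flags in the trailing
comment, are assumptions made for illustration. Unlike a plain restart loop, it
runs the process as a child and records how it ended (exit status or signal),
which may help when the logfiles show nothing.

```python
import signal
import subprocess
import time


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable reason."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as a negative number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    return f"exited with status {returncode}"


def supervise(command, max_runs=None, delay=1.0, log=print):
    """Run `command` in the foreground and start it again whenever it ends.

    `max_runs` limits the number of runs (None means run forever).
    Returns the list of exit reasons collected so far.
    """
    reasons = []
    while max_runs is None or len(reasons) < max_runs:
        started = time.time()
        returncode = subprocess.call(command)  # blocks until the child ends
        uptime = time.time() - started
        reason = describe_exit(returncode)
        log(f"{command[0]} ran for {uptime:.0f}s, then {reason}")
        reasons.append(reason)
        time.sleep(delay)  # avoid a tight loop if the child fails instantly
    return reasons


# Hypothetical invocation (path and flags are assumptions; the child must
# stay in the foreground for the supervisor to see its exit status):
# supervise(["/usr/sbin/slapd", "-d", "0"])
```

A reason such as "killed by signal 11 (SIGSEGV)" or "killed by signal 9
(SIGKILL)" would point in a different direction than a normal exit status, so
the log line is the useful part of this sketch.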