Hello,
For a short time now, my slapd has been crashing often. I have two servers running in multi-master (MM) replication. I use OpenLDAP version 2.4.30 (updates are only possible in dedicated time slots...). The loglevel is set to 256.
I see some strange messages in my log before the slapd crashes:
"ch_realloc of 986032 bytes failed"
"ch_malloc of 294896 bytes failed"
"bdb(ou=root): txn_checkpoint: failed to flush the buffer cache: Cannot allocate memory"
"ch_malloc of 34159 bytes failed"
What do they mean, and how can I solve this problem?
The system has 16 GByte of RAM, and no other service is running on it. The database holds about 1500000 entries, and the LDIF is about 2 GByte in size.
Because of the memory messages, I reduced the cache settings from
cachesize 1000000 dncachesize 1000000 idlcachesize 3000000
to
cachesize 750000 dncachesize 750000 idlcachesize 2250000
but the problem still persists.
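
As a rough sanity check of these cache settings, here is a minimal sketch of the arithmetic. It takes the entry count, LDIF size, and set_cachesize value from this post and uses the average LDIF size per entry as a stand-in for the size of a cached entry. The in-memory overhead factor is purely an assumption (the real per-entry footprint in slapd is not known here), and dncachesize and idlcachesize are not included, so the result is only a lower bound.

```python
# Rough lower-bound estimate of slapd cache memory.
# overhead_factor is an assumption, not a measured value.
GIB = 1024 ** 3


def entry_cache_bytes(cached_entries, total_entries, ldif_bytes, overhead_factor=1.0):
    """Estimate memory held by `cached_entries` entries in the entry cache.

    The average LDIF size per entry stands in for the entry size and is
    scaled by a hypothetical in-memory overhead factor.
    """
    avg_entry_bytes = ldif_bytes / total_entries
    return cached_entries * avg_entry_bytes * overhead_factor


total_entries = 1_500_000   # from the post
ldif_bytes = 2 * GIB        # from the post (approximate)
bdb_cache_bytes = 2 * GIB   # set_cachesize 2 0 1 in DB_CONFIG

for cachesize in (1_000_000, 750_000):
    for factor in (1.0, 3.0):  # 3.0 is an illustrative guess only
        est = entry_cache_bytes(cachesize, total_entries, ldif_bytes, factor)
        print(
            f"cachesize={cachesize} factor={factor}: "
            f"entry cache ~{est / GIB:.2f} GiB, "
            f"with BDB cache ~{(est + bdb_cache_bytes) / GIB:.2f} GiB"
        )
```

Comparing the printed totals with the 16 GByte of RAM shows how sensitive the estimate is to the assumed overhead factor; measuring the real resident size of slapd would replace that guess.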
I can't believe that the memory is insufficient. Sysstat is running, and I see enough cache memory (about 5 GByte at all times), and the swap (2 GByte) is almost unused (about 2 MByte).
vm.swappiness is set to the default (60), so the swap should be used more before the memory runs out. OOM kill is enabled via SysRq (signalling of processes), so slapd should be terminated by the kernel ...
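
Since allocations fail while the system still reports free memory, one thing that could be ruled out is a per-process limit on the slapd process. Whether such a limit is set here is an assumption; nothing in the logs confirms it. The sketch below reads the "Max address space" row from /proc/<pid>/limits on Linux, taking the pid from the pidfile configured in slapd.conf. The function names are made up for illustration.

```python
def parse_limit(limits_text, name="Max address space"):
    """Return (soft, hard) for one row of /proc/<pid>/limits text, or None."""
    for line in limits_text.splitlines():
        if line.startswith(name):
            # After the row name come: soft limit, hard limit, units.
            fields = line[len(name):].split()
            return fields[0], fields[1]
    return None


def slapd_limit(pidfile="/var/run/slapd/slapd.pid", name="Max address space"):
    """Look up one limit of the running slapd.

    Needs permission to read the pidfile and /proc/<pid>/limits.
    """
    with open(pidfile) as f:
        pid = int(f.read().strip())
    with open(f"/proc/{pid}/limits") as f:
        return parse_limit(f.read(), name)
```

Calling slapd_limit() on the server returns the soft and hard values as strings, either a byte count or "unlimited". A finite value well below the physical RAM would be worth a closer look.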
My configuration:

== DB_CONFIG ==================================
set_cachesize 2 0 1
set_lg_regionmax 262144
set_lg_bsize 2097152
set_flags DB_LOG_AUTOREMOVE

== slapd.conf ==================================
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/yast.schema
include /etc/openldap/schema/rfc2307bis.schema

pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args

modulepath /usr/lib/ldap
moduleload back_bdb
moduleload syncprov
moduleload back_monitor
sizelimit -1
timelimit 300
disallow bind_anon
require authc
gentlehup on
tool-threads 8
serverID <001|002>

database bdb
suffix "ou=demo"
rootdn "cn=admin"

directory /var/lib/ldap
loglevel 256

cachesize 750000
dncachesize 750000
idlcachesize 2250000
cachefree 500

dirtyread
dbnosync
shm_key 7
checkpoint 4096 15

index objectClass,entryUUID,entryCSN eq
index cn eq,sub
index ... own indexes

syncrepl rid=<001|002>
  provider=ldap://master-<01|02>
  type=refreshAndPersist
  keepalive=360:10:5
  retry="5 5 300 +"
  searchbase="ou=demo"
  attrs="*,+"
overlay syncprov
mirrormode TRUE
syncprov-checkpoint 100 5

database monitor

==========================================
(I know, dirtyread and dbnosync are not recommended.)
Additionally, I see messages like:
"bdb_idl_delete_key: c_del id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)"
Should I care about them?
Thanks,
Meike