Nick Milas wrote:
Hi,
I am running a v2.4.31 consumer on CentOS 5.8 to serve user accounts (and aliases) on a Postfix mail server running locally. It has been running for a long time without problems.
Today, after a user sent (at 14:53:39) a mass mail (through a group alias, implemented using ldap dynlist), Postfix stalled and the server (a VM under KVM) became overloaded. I noticed that openldap was using all the CPU:
# top
top - 15:30:01 up 81 days,  2:11,  1 user,  load average: 113.58, 114.36, 104.02
Tasks: 460 total,   3 running, 457 sleeping,   0 stopped,   0 zombie
Cpu(s): 98.9%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  1.1%hi,  0.0%si,  0.0%st
Mem:   3089988k total,  3074912k used,    15076k free,    12180k buffers
Swap:  2064376k total,       92k used,  2064284k free,  1909976k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2209 ldap      18   0  577m  17m 8952 S 93.4  0.6  55:03.67 slapd
...
Your load average was really 113? I don't see any "threads" setting in your config. By default slapd only uses 16 threads, so by itself it could never drive the load average above 16. Something else is going quite wrong on your system.
Your database looks pretty small. But still, I see no cachesize configuration in it. That might help. Or just switch to MDB and continue to not worry about cache sizes.
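As a rough sketch of the two settings mentioned above, the fragment below shows where "threads" and "cachesize" would go in slapd.conf. The numeric values are illustrative assumptions only, not tuned for this system: 16 is simply the documented default for "threads", and the cache figures are placeholders that would need to be sized against the real entry count. The commented-out lines show what a switch to MDB might look like; the "maxsize" value there is likewise an assumption.

```
# Global section (before any "database" line).
# 16 is the default; stated explicitly here only to make it visible.
threads 16

database hdb
suffix "dc=example,dc=com"
directory /usr/local/openldap/var/openldap-data

# Number of entries kept in the in-memory entry cache (assumed value).
cachesize 10000
# Number of index slots kept in the IDL cache (assumed value).
idlcachesize 30000

# Alternative: use back-mdb, which has no entry or IDL cache to tune.
# The database would have to be reloaded (e.g. slapcat / slapadd) first.
#database mdb
#suffix "dc=example,dc=com"
#directory /usr/local/openldap/var/openldap-data
#maxsize 1073741824
```

With hdb, the BDB cache configured in DB_CONFIG also matters, so these directives are only part of the picture.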
database hdb
suffix "dc=example,dc=com"
rootdn "cn=Manager,dc=example,dc=com"
rootpw secret

########
# ACLs #
########
include /usr/local/openldap/etc/openldap/acl.conf

directory /usr/local/openldap/var/openldap-data

index objectClass eq,pres
index employeeType pres,eq
index cn eq,pres,sub
index sn,givenname eq,pres,sub
index mail eq,pres,sub
index uid eq,pres
index ou eq,pres
index mailacceptinggeneralid eq,pres
index owner eq
index entryCSN,entryUUID eq
index vacationActive eq
index associatedDomain pres,eq,sub
index dc eq
index emailLocalAddress eq,pres,sub

overlay dynlist
dynlist-attrset nisMailAlias labeledURI
dynlist-attrset groupOfURLs labeledURI member

syncrepl rid=111
    provider=ldaps://ldap.example.com
    tls_reqcert=never
    type=refreshAndPersist
    retry="60 15 180 +"
    searchbase="dc=example,dc=com"
    schemachecking=off
    bindmethod=simple
    binddn="uid=FullReplAcc1,ou=System,dc=example,dc=com"
    credentials="mypassword"

database monitor

access to *
    by dn.exact="cn=Manager,dc=example,dc=com" read
    by * none
# ls -la /usr/local/openldap/var/openldap-data/
total 14120
drwxr-xr-x 2 ldap ldap     4096 Sep 28 15:33 .
drwxr-xr-x 4 ldap ldap     4096 Apr 26 20:56 ..
-rw-r--r-- 1 ldap ldap     4096 Sep 28 15:33 alock
-rw------- 1 ldap ldap  1261568 Sep 28 15:32 associatedDomain.bdb
-rw------- 1 ldap ldap   512000 Sep 28 15:32 cn.bdb
-rw------- 1 ldap ldap    24576 Sep 28 15:33 __db.001
-rw------- 1 ldap ldap  1294336 Sep 28 16:12 __db.002
-rw------- 1 ldap ldap 32776192 Sep 28 16:12 __db.003
-rw------- 1 ldap ldap  3145728 Sep 28 16:11 __db.004
-rw------- 1 ldap ldap   729088 Sep 28 16:12 __db.005
-rw------- 1 ldap ldap    32768 Sep 28 16:11 __db.006
-rw-r--r-- 1 ldap ldap      924 Apr 26 21:01 DB_CONFIG
-rw------- 1 ldap ldap      845 Apr 26 20:56 DB_CONFIG.example
-rw------- 1 ldap ldap    61440 Sep 28 15:32 dc.bdb
-rw------- 1 ldap ldap   339968 Sep 28 15:33 dn2id.bdb
-rw------- 1 ldap ldap   212992 Sep 28 15:33 emailLocalAddress.bdb
-rw------- 1 ldap ldap    20480 Sep 28 15:33 employeeType.bdb
-rw------- 1 ldap ldap   118784 Sep 28 15:33 entryCSN.bdb
-rw------- 1 ldap ldap    81920 Sep 28 15:33 entryUUID.bdb
-rw------- 1 ldap ldap    90112 Sep 28 15:32 givenName.bdb
-rw------- 1 ldap ldap  2457600 Sep 28 15:33 id2entry.bdb
-rw------- 1 ldap ldap    24576 Jul  9 13:13 mailacceptinggeneralid.bdb
-rw------- 1 ldap ldap   212992 Sep 28 15:33 mail.bdb
-rw------- 1 ldap ldap   266240 Sep 28 15:33 objectClass.bdb
-rw------- 1 ldap ldap    40960 Sep 28 15:33 ou.bdb
-rw------- 1 ldap ldap     8192 Sep 28 15:32 owner.bdb
-rw------- 1 ldap ldap   253952 Sep 28 15:32 sn.bdb
-rw------- 1 ldap ldap    28672 Sep 28 15:33 uid.bdb
-rw------- 1 ldap ldap     8192 Sep 25  2011 vacationActive.bdb