I have a box (ProLiant DL560) with these features:
- 4 GB RAM
- 3 x Xeon 2.8 GHz
- 2 x SCSI HD, 10k RPM (no RAID)
I have the BDB files on one of those disks and the logs on the other. When I installed the server I wasn't sure what a correct value for cachesize would be, so I decided to play it safe.
sldcu DB_CONFIG
---------------
set_cachesize 0 1073741824 0
set_lk_max_objects 2500
set_lk_max_locks 2500
set_lk_max_lockers 2500
set_lg_dir /storage/bdb-log/sldcu
set_flags DB_LOG_AUTOREMOVE

accesslog DB_CONFIG
-------------------
set_cachesize 0 715827883 0
set_lk_max_objects 2500
set_lk_max_locks 2500
set_lk_max_lockers 2500
set_lg_dir /storage/bdb-log/accesslog
set_flags DB_LOG_AUTOREMOVE
I was reading this[1] yesterday and found out that I have a very large value. For my dc=sld,dc=cu DB I have a 6.2 MB dn2id.bdb and a 56 MB id2entry.bdb; the DIT contains about 60k entries.
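To put those numbers side by side, here is a rough Python sketch that compares the configured set_cachesize values with the two database files mentioned above. It only counts dn2id.bdb and id2entry.bdb (index files are not included because their sizes are not given here), and it assumes the quoted "MB" figures can be treated as MiB, so the ratio is approximate.

```python
# Rough comparison of the configured BDB cache with the on-disk DB files.
# Assumption: the "MB" sizes quoted in this message are close enough to MiB.
MIB = 1024 * 1024

# set_cachesize byte values from the two DB_CONFIG files
cache_bytes = {"sldcu": 1073741824, "accesslog": 715827883}

# File sizes quoted for the dc=sld,dc=cu database (index files not counted)
db_files_mb = {"dn2id.bdb": 6.2, "id2entry.bdb": 56.0}


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / MIB


db_total_mb = sum(db_files_mb.values())
sldcu_cache_mib = to_mib(cache_bytes["sldcu"])
ratio = sldcu_cache_mib / db_total_mb

for name, size in cache_bytes.items():
    print(f"{name}: cache = {to_mib(size):.1f} MiB")
print(f"dn2id + id2entry = {db_total_mb:.1f} MB")
print(f"sldcu cache is about {ratio:.1f}x the size of those two files")
```

So the sldcu cache is 1 GiB against roughly 62 MB of dn2id plus id2entry data, which is why the value looks oversized to me.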
Normally a larger value is a Good Thing(TM), but I'm especially worried about slapd using around 40%-50% of the time of (one) CPU, and every so often a 'deferring operation' message gets logged.
The server handles a sustained 30 connections per second during working hours. Read Waiters stay at about 100 during this same period. Yesterday I sent a message to the list asking for opinions about that Read Waiters value because it doesn't sound good to me, but, well, I know that 'sounds good' is very subjective.
I was running 2.3.38 until yesterday morning, when I upgraded to 2.4.11. With the same setup (only slapd changed) the mail service was badly affected. Dovecot began to take several (10+) seconds to authenticate; in other cases the POP3/IMAP connection was dropped due to a timeout.
While Dovecot was failing, my slapd.log was filled solid with 'deferring operation' messages. When I stopped Dovecot the messages disappeared; when I started Dovecot again, the messages began to fill the logs again.
So I thought it could be some problem on the Dovecot side (although it was playing nicely with slapd 2.3.38).
Thank God I read man slapd-hdb and learned about the cache tuning values there. I then raised olcDbCacheSize / olcDbDNCacheSize / olcDbIDLCacheSize and things began to calm down.
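For anyone following along, a change like that can be expressed as an LDIF modify against cn=config. The Python sketch below just builds such an LDIF as text. Everything numeric in it is hypothetical: the database DN olcDatabase={1}hdb,cn=config, the entry cache sized to the roughly 60k entries in the DIT, and the IDL cache at 3x the entry cache (a commonly cited rule of thumb for back-hdb, used here as an assumption). These are not the values actually set on this server.

```python
# Illustrative only: build an LDIF that replaces the three back-hdb cache
# attributes. The DN and all sizes below are hypothetical examples.

def cache_ldif(db_dn, entries, idl_factor=3):
    """Return an LDIF modify that sets the entry, DN and IDL cache sizes."""
    settings = [
        ("olcDbCacheSize", entries),                   # entry cache (entries)
        ("olcDbDNCacheSize", entries),                 # DN cache (DNs)
        ("olcDbIDLCacheSize", entries * idl_factor),   # IDL cache (index slots)
    ]
    lines = [f"dn: {db_dn}", "changetype: modify"]
    for i, (attr, value) in enumerate(settings):
        if i:
            lines.append("-")  # separator between modifications
        lines += [f"replace: {attr}", f"{attr}: {value}"]
    return "\n".join(lines) + "\n"


# Hypothetical database DN and a DIT of about 60k entries
ldif = cache_ldif("olcDatabase={1}hdb,cn=config", 60000)
print(ldif)
```

The output could then be reviewed and fed to ldapmodify by hand; the script itself does not touch the server.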
I wonder how slapd 2.3.38 was serving Dovecot so nicely without the tuning I have now made.
On the other hand I have the suspicion, based on benchmark data published on the list, that there is something wrong with having that large a cache and still seeing 50% CPU for fewer than 30 connections per second.
I would very much appreciate any help, advice or hints anyone could give me about this issue.
Regards, maykel
openldap-technical@openldap.org