Hi,
I did quite a bit of reading and research before sending this email to the list for help. If I have missed some basic concepts here, please excuse my ignorance. Thanks in advance for your help and time.
1. Summary
The initial search in my prototype with OpenLDAP (slapd + BDB) seems to be slow. What is the reason, and how can I fix it?
2. Configuration
2.1 Environment
Linux CentOS, 1 hard disk (therefore unfortunately the BDB transaction logs and database files are written to the same disk), 120GB disk space (80% unused), 1GB RAM, reserved for this prototyping, OpenLDAP 2.3.39 with default BDB installation
2.2 slapd.conf (modified trivially for discussion purpose)
# global configuration
loglevel 0
# BDB
database bdb
suffix "dc=test,dc=dummy,dc=com"
rootdn "cn=Manager,dc=test,dc=dummy,dc=com"
# Cleartext passwords, especially for the rootdn, should
# be avoided. See slappasswd(8) and slapd.conf(5) for details.
# Use of strong authentication encouraged.
rootpw secret
# The database directory MUST exist prior to running slapd AND
# should only be accessible by the slapd and slap tools.
# Mode 700 recommended.
directory /usr/local/var/openldap-data
#Other DB configuration
idlcachesize 60000
cachesize 20000
# Indices to maintain
# The indexes support searches on first name, last name, and email, for both exact matches and trailing wildcards
index objectClass eq
index gn pres,eq,sub
index sn pres,eq,sub
index mail pres,eq,sub
2.3 DB_CONFIG (for BDB)
set_cachesize 0 52428800 1
set_lg_bsize 2097512
set_flags DB_LOG_AUTOREMOVE
set_lg_regionmax 262144
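For reference, my understanding is that the set_cachesize arguments are gigabytes, bytes, and the number of cache regions, so the line above should work out to a 50 MB BDB cache in a single region. A quick Python sketch of that arithmetic (cache_bytes is just a helper name I made up for illustration, not part of any BDB tool):

```python
# Sketch: interpret a DB_CONFIG line of the form
#   set_cachesize <gbytes> <bytes> <ncache>
# cache_bytes is a hypothetical helper, used here only to show the arithmetic.

def cache_bytes(line):
    """Return (total cache size in bytes, number of cache regions)."""
    directive, gbytes, nbytes, ncache = line.split()
    if directive != "set_cachesize":
        raise ValueError("not a set_cachesize line: " + line)
    # Total size is gbytes gigabytes plus the extra bytes.
    return int(gbytes) * 1024**3 + int(nbytes), int(ncache)

total, regions = cache_bytes("set_cachesize 0 52428800 1")
print(total / 1024**2, "MiB in", regions, "region(s)")
```

If that reading is right, the BDB cache is small compared with 2 million indexed entries, which may be part of the problem.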
2.4 Data setup
2 million records (users with gn, sn, email, mobile, street address, etc.) in the BDB; all records are indexed using the indices in the above slapd.conf and are grouped under one organizational unit per first character of the last name. For example,
dn: ou=Z,dc=test,dc=dummy,dc=com
objectclass: organizationalUnit
ou: Z
Sample LDIF entry:
#Directory Entry
dn: uid=ABCDEFGHIJKLMNOPQRSTUVWXYZ123459,ou=F,dc=test,dc=dummy,dc=com
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
uid: ABCDEFGHIJKLMNOPQRSTUVWXYZ123459
...... (details omitted)
3. Symptom/Problem
The first (fresh) search was very slow when I searched by first name only with a trailing wildcard, such as "Larry*" (which returned 478 entries/users). The response time was generally higher than 5 seconds. Depending on the number of records found, the response time could exceed 20 or even 50 seconds. During the search, "iostat" showed %iowait above 95%, await much higher than svctm, and device %util above 96%. Here is the "iostat" output:
Time: 10:51:34 AM
avg-cpu:   %user   %nice    %sys %iowait   %idle
            3.10    0.00    1.40   95.50    0.00
Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await svctm %util
hda        0.00    2.90 64.94 65.33 1322.68  580.22  661.34  290.11    14.61    51.99  343.51  7.44 96.92
dm-0       0.00    0.00 64.94 72.53 1322.68  580.22  661.34  290.11    13.84    55.44  330.62  7.06 96.99
dm-1       0.00    0.00  0.00  0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00  0.00  0.00
However, a subsequent search (using exactly the same search criteria) is much faster (within 200 ms). I believe this is because of the cache.
I ran "db_stat -m" and saw a cache hit rate above 90% (I guess that is normal?). The detailed output is in the attachment.
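As a rough sanity check on the iostat figures above, here is a small Python sketch of how I read the hda line. It assumes that await covers queueing plus service time while svctm is service time only, so the difference approximates time spent waiting in the queue; the variable names are my own.

```python
# Values copied from the hda line of the iostat output above.
r_per_s = 64.94      # read requests per second
rkb_per_s = 661.34   # kilobytes read per second
await_ms = 343.51    # average time per request, queueing included (assumed)
svctm_ms = 7.44      # average device service time per request

# Average size of one read request, in kB.
kb_per_read = rkb_per_s / r_per_s

# Approximate time a request spends queued before being serviced.
queue_wait_ms = await_ms - svctm_ms

print("avg read size: %.2f kB" % kb_per_read)
print("approx. queue wait: %.2f ms" % queue_wait_ms)
```

This comes out to about 10 kB per read and roughly 336 ms of queueing per request, which I take to mean many small, probably random, reads piling up on the single disk.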
4. Questions
The "iostat" output shows an obvious I/O bottleneck. Assuming I can't upgrade my hardware (for example, by adding another disk dedicated to the transaction logs), and assuming I won't set a limit on the maximum number of entries returned, is there anything else I can do (typically BDB/slapd tuning or configuration) to make the fresh/first search much faster (say, within 2 seconds in the worst case)? Did I do anything wrong? Please advise.
Thanks a lot!
Vic