Hi Dieter,

Many thanks for your response. I increased the cache sizes tenfold in both slapd.conf and DB_CONFIG. Performance improved, but not dramatically. For example:

A first-time wildcard search by first name only ("scott*") still took 12937 ms to find 881 users (BTW, I was using the SpringLDAP Java client). My other responses are inline.


I wanted to take the scaling factor into account. If we use OpenLDAP in production, we will have over 350 million user entries. I probably don't want to put that many entries in cache unless it is really needed. That is why I used a relatively small cache in my prototyping (in the hope that it can scale up).
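
To put rough numbers on the scaling concern, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that on-disk size and cache needs grow linearly with the entry count, which B-tree indexes need not do. The 2 million entries, ~3GB of database files and the set_cachesize value are the figures from this prototype; the variable names are made up.

```python
PROTOTYPE_ENTRIES = 2_000_000        # entries loaded in the prototype
PRODUCTION_ENTRIES = 350_000_000     # expected production entries
PROTOTYPE_DB_BYTES = 3 * 10**9       # ~3GB of *.bdb files (rounded)
PROTOTYPE_CACHE_BYTES = 622_387_200  # current set_cachesize value

scale = PRODUCTION_ENTRIES / PROTOTYPE_ENTRIES

# ASSUMPTION: linear growth; treat the results as order-of-magnitude only.
projected_db_gb = PROTOTYPE_DB_BYTES * scale / 10**9
projected_cache_gb = PROTOTYPE_CACHE_BYTES * scale / 10**9

print(f"scale factor:        {scale:.0f}x")
print(f"projected DB files:  ~{projected_db_gb:.0f} GB")
print(f"projected BDB cache: ~{projected_cache_gb:.0f} GB")
```

Under that assumption, the production data set would be far larger than anything a single cache of this kind could hold, which is why the cache sizing question matters more than the raw entry count.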



Is there a flaw in how I organized/grouped the data/entries? Any further advice will be greatly appreciated. Thanks.



--- On Thu, 9/11/08, Dieter Kluenter <dieter@dkluenter.de> wrote:
From: Dieter Kluenter <dieter@dkluenter.de>
Subject: Re: Help: Slow LDAP search with high %iowait
To: openldap-technical@openldap.org
Date: Thursday, September 11, 2008, 6:54 AM

Hi,

Victor <victorfuman@yahoo.com> writes:

> Would you folks mind sharing some thoughts/ideas on this? Thanks a lot!
>
>
> --- On Mon, 9/8/08, Victor <victorfuman@yahoo.com> wrote:
>
>> From: Victor <victorfuman@yahoo.com>
>> Subject: Help: Slow LDAP search with high %iowait
>> To: openldap-technical@openldap.org
>> Date: Monday, September 8, 2008, 5:35 PM
>> Hi,
>>
>> I did quite a bit of reading and research before sending email
>> to this list for help. If I have missed some basic concepts
>> here, please excuse my ignorance. Thanks for your help and
>> time in advance.

Did you read
http://www.openldap.org/faq/data/cache/1075.html ?
[Vic] Yeah, I read this (and actually almost all of the FAQs) before. Below is how I came up with the BDB cache size:


 

BDB database file    Tree internal pages   Page size   Total (bytes)
dn2id.bdb                           1335        4096         5472256
id2entry.bdb                         118       16384         1949696
objectClass.bdb                        1        4096            8192
sn.bdb                               819        4096         3358720
givenName.bdb                       1060        4096         4345856
mail.bdb                           11499        4096        47104000
Total                                                       62238720
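
As a cross-check, the per-file totals in the table are consistent with (internal pages + 1) * page size. That formula is inferred from the numbers themselves, so treat it as an assumption about how the figures were derived. A minimal Python sketch that reproduces them (the names `BTREES` and `cache_bytes` are made up for illustration):

```python
# (file, tree internal pages, page size in bytes), copied from the table
BTREES = [
    ("dn2id.bdb", 1335, 4096),
    ("id2entry.bdb", 118, 16384),
    ("objectClass.bdb", 1, 4096),
    ("sn.bdb", 819, 4096),
    ("givenName.bdb", 1060, 4096),
    ("mail.bdb", 11499, 4096),
]

def cache_bytes(internal_pages, page_size):
    # ASSUMPTION: one extra page per file, as the per-file totals imply.
    return (internal_pages + 1) * page_size

per_file = {name: cache_bytes(pages, size) for name, pages, size in BTREES}
total = sum(per_file.values())

for name, nbytes in per_file.items():
    print(f"{name:16s} {nbytes:>10d}")
print(f"{'Total':16s} {total:>10d}")
```

The sum comes to 62238720 bytes, matching the table; ten times that value is the 622387200 now used for set_cachesize.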



>> 1. Summary
>> The initial search in my prototyping with OpenLDAP (slapd +
>> BDB) seemed to be slow. What is the reason and How could I
>> fix it?
>>
>> 2. Configuration
>> 2.1 Environment
>> Linux CentOS, 1 hard disk (therefore unfortunately the BDB
>> transaction logs and database files are written to the same
>> disk), 120GB disk space (80% unused), 1GB RAM, reserved for
>> this prototyping, OpenLDAP 2.3.39 with default BDB
>> installation

Add more RAM to your machine.
[Vic] What is the scientific way to determine how much more RAM I need to add?

>> 2.2 slapd.conf (modified trivially for discussion
>> purpose)
>> # global configuration
>> loglevel 0
>>
>> # BDB
>> database bdb
>> suffix "dc=test,dc=dummy,dc=com"
>> rootdn
>> "cn=Manager,dc=test,dc=dummy,dc=com"
>> # Cleartext passwords, especially for the rootdn, should
>> # be avoid. See slappasswd(8) and slapd.conf(5) for
>> details.
>> # Use of strong authentication encouraged.
>> rootpw secret
>> # The database directory MUST exist prior to running slapd
>> AND
>> # should only be accessible by the slapd and slap tools.
>> # Mode 700 recommended.
>> directory /usr/local/var/openldap-data
>> #Other DB configuration
>> idlcachesize 60000
>> cachesize 20000

Increase cachesize and add dncachesize; see man slapd-bdb(5).
[Vic] Now the configuration is:
idlcachesize 600000
cachesize 200000
dncachesize 400000
>> # Indices to maintain
>> # the indexes are to support search on first name, last
>> name and email for both exact match and wildcards at the
>> end
>> index objectClass eq
>> index gn pres,eq,sub
>> index sn pres,eq,sub
>> index mail pres,eq,sub
>>
>> 2.3 DB_CONFIG (for BDB)
>> set_cachesize 0 52428800 1
>> set_lg_bsize 2097512
>> set_flags DB_LOG_AUTOREMOVE
>> set_lg_regionmax 262144
[Vic] set_cachesize is now:
set_cachesize 0 622387200 1
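
For reference, the three set_cachesize arguments are gigabytes, additional bytes, and the number of cache segments. Below is a small Python sketch for turning a byte count into that directive. The helper name `set_cachesize_line` is hypothetical, and the second call uses an arbitrary size just to show the gigabyte split.

```python
GIB = 1024 ** 3

def set_cachesize_line(total_bytes, ncache=1):
    """Format a DB_CONFIG directive: set_cachesize <gbytes> <bytes> <ncache>."""
    gbytes, remainder = divmod(total_bytes, GIB)
    return f"set_cachesize {gbytes} {remainder} {ncache}"

# The value currently in DB_CONFIG (10 x 62238720 bytes).
print(set_cachesize_line(622387200))

# A hypothetical larger cache, only to illustrate the <gbytes> field.
print(set_cachesize_line(3 * GIB + 1024))
```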

>> 2.4 Data setup
>> 2 million records (users with gn, sn, email, mobile, street
>> address, etc.) in the BDB; all records are indexed using the
>> indexes in the above slapd.conf and grouped by the first
>> character of lastName. For example,
>> dn: ou=Z,dc=test,dc=dummy,dc=com
>> objectclass: organizationalUnit
>> ou: Z
>
>> Sample LDIF entry:
>> #Directory Entry
>> dn:
>> uid=ABCDEFGHIJKLMNOPQRSTUVWXYZ123459,ou=F,dc=test,dc=dummy,dc=com
>> objectclass: top
>> objectclass: person
>> objectclass: organizationalPerson
>> objectclass: inetOrgPerson
>> uid: ABCDEFGHIJKLMNOPQRSTUVWXYZ123459
>> ...... (details omitted)

What is the size of id2entry.bdb and all indexed <attribute>.bdb files?
[Vic] Total is ~3GB. Here are the details:
[root@localhost openldap-data]# du -h -k *.bdb
475484 dn2id.bdb
49152 givenName.bdb
1695212 id2entry.bdb
710744 mail.bdb
2628 objectClass.bdb
57088 sn.bdb
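
To relate these sizes to the cache, here is a rough Python sketch that sums the du output (in 1 KB blocks, as printed above) and compares it with the current set_cachesize value. The dictionary and variable names are made up for illustration, and the comparison ignores the slapd entry, IDL and DN caches.

```python
# du -k output from above, in 1024-byte blocks
DU_KB = {
    "dn2id.bdb": 475484,
    "givenName.bdb": 49152,
    "id2entry.bdb": 1695212,
    "mail.bdb": 710744,
    "objectClass.bdb": 2628,
    "sn.bdb": 57088,
}
BDB_CACHE_BYTES = 622387200  # current set_cachesize value

total_kb = sum(DU_KB.values())
total_bytes = total_kb * 1024
coverage = BDB_CACHE_BYTES / total_bytes

print(f"on-disk total: {total_kb} KB ({total_bytes / 1024**3:.2f} GiB)")
print(f"BDB cache covers about {coverage:.0%} of the database files")
```

By this measure the BDB cache holds roughly a fifth of the files on disk, and the machine's 1GB of RAM is about a third of their total size.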

>> 3. Symptom/Problem
>>
>> It was very slow in the first (fresh) search if I searched
>> by wildcard firstname only like "Larry*" (which
>> returned 478 entries/users). The response time was generally
>> higher than 5 seconds. Depending on the count of records found,
>> the response time might exceed 20 or even 50 seconds. During
>> the search, the "iostat" result showed +95%
>> %iowait, await was much higher than svctm, the device %util
>> was over 96%. Here is the "iostat" output:

This might depend on disk type, PATA vs. SATA, read ahead abilities
and so on.

>> However, the subsequent search (using the exact same search
>> criteria) is much faster (within 200ms). I believe it is
>> because of the cache.

200 ms is not fast enough; depending on network and server load,
response times of 4 to 10 ms are reasonable.

-Dieter

--
Dieter Klünter | Systemberatung
http://www.dpunkt.de/buecher/2104.html
GPG Key ID:8EF7B6C6
53°08'09,95"N
10°08'02,42"E