Hi Howard,
Thanks for the help :D
We have been testing on a ramdisk as well, to make sure that disk thrashing is not the root cause.
If your searches are not running long enough to show up for profiling, increase the number of second level entries until you get something you can profile.
Thanks
Tim
On 11/11/10 21:38, Howard Chu wrote:
Tim Dyce wrote:
Hi Dieter,
Thanks for the tips on tuning. Sadly, the problem is still haunting us :(
Andrey Kiryanov at CERN has been doing a lot of work on this performance degradation problem as well. He has tried BDB 4.8.30 and OpenLDAP 2.4.23 but the problem is still apparent.
I've run the test setup you provided here http://www.openldap.org/lists/openldap-technical/201010/msg00237.html
but so far I'm seeing constant (0:00.0 second) results from ldapsearch.
Some differences - I used back-hdb, which is going to be superior for a heavy add/delete workload. Also my test DB is running on a tmpfs (RAMdisk).
The basic test we are running (sent earlier) creates 100 ou entries in the root, each with 250 child ou entries, then deletes 20-35% of these and re-adds them. For each deletion cycle the ldapsearch performance degrades, taking longer to complete the search each time.
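For anyone wanting to reproduce the shape of this workload, here is a rough Python sketch of one add/delete cycle. It is not the script that was sent earlier: the suffix dc=example,dc=com, the ou names and the function names are all made up for illustration, and it assumes that only the child (leaf) entries are deleted and re-added. It only builds LDIF text, which could then be fed to ldapmodify.

```python
import random

BASE_DN = "dc=example,dc=com"  # hypothetical suffix, not taken from the thread


def build_tree(parents=100, children=250, base=BASE_DN):
    """Return {parent_dn: [child_dn, ...]} for the two-level ou tree."""
    tree = {}
    for p in range(parents):
        parent_dn = f"ou=p{p},{base}"
        tree[parent_dn] = [f"ou=c{c},{parent_dn}" for c in range(children)]
    return tree


def pick_for_deletion(tree, rng, low=0.20, high=0.35):
    """Choose a random 20-35% of the child entries (assumption: leaves only)."""
    all_children = [dn for kids in tree.values() for dn in kids]
    fraction = rng.uniform(low, high)
    return rng.sample(all_children, int(len(all_children) * fraction))


def to_ldif(dns, changetype):
    """Render DNs as LDIF change records ("add" or "delete")."""
    records = []
    for dn in dns:
        lines = [f"dn: {dn}", f"changetype: {changetype}"]
        if changetype == "add":
            # The RDN value doubles as the ou attribute value.
            ou = dn.split(",", 1)[0].split("=", 1)[1]
            lines += ["objectClass: organizationalUnit", f"ou: {ou}"]
        records.append("\n".join(lines))
    return "\n\n".join(records) + "\n"


tree = build_tree()
victims = pick_for_deletion(tree, random.Random(0))  # fixed seed for repeatability
delete_ldif = to_ldif(victims, "delete")
readd_ldif = to_ldif(victims, "add")
```

Running the delete LDIF and then the re-add LDIF would make one deletion cycle; repeating with a fresh selection each time gives the 2, 20 or 1000 iterations mentioned below.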
The performance is consistent, across restarts of slapd, and tied to the current state of the database. I have tried rsyncing out the database, and returning it later, and the performance is consistent with the number of deletion cycles the database has undergone.
The only clue I have is that when dumping the databases with db_dump it's clear that the ordering of the database becomes increasingly less aligned with the order of the output data when doing a full tree search as we are. Which suggests that the database is writing frequently accessed entries too often instead of holding them in cache?
I have run cachegrind against the server at 2, 20 and 1000 deletion iterations and the results are very different - http://www.ph.unimelb.edu.au/~tjdyce/callgrind.tar.gz The number of fetches grows massively over time.
Anything you guys can suggest would be much appreciated, it's started to affect quite a number of our grid sites.
Cheers,
Tim
On 04/11/10 02:56, Dieter Kluenter wrote:
Hi Dieter,
I've done some more testing with openldap 2.3 and 2.4, on Redhat and Ubuntu. I even went as far as placing the BDB database directory in a ramdisk. But the performance still seems to degrade over time as data is added then deleted repeatedly from the ldap server.
It looks like the BDB database starts to fragment or lose structure over time? I've tried a few DB options that seem to have some impact.
Any ideas on what I can do from here?
Quite frankly, I have no clue, all I can do is guess. First let's define the problem: you have measured the presentation of search results on the client side, and you observed an increase in the time required to present the results. Most likely it is either a caching problem, a disk problem or a network problem. As far as OpenLDAP is concerned, there are four caches to watch:
- the bdb/hdb database (DB_CONFIG, cachesize)
- the DN cache (dncachesize)
- the cache of searched and indexed attribute types (idlcachesize)
- the frontside cache of search results (cachesize)
Please check in slapd.conf whether appropriate sizes are configured; see slapd-bdb(5) and slapd.conf(5) for more information.
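As a quick way to do that check, something like the following Python sketch could report which of the cache directives named above are actually set. The sample configuration and its values are invented for illustration, the function name is hypothetical, and the parsing is deliberately naive: one directive per line, and if there are several database sections the last value wins. The BDB cache size in DB_CONFIG lives in a separate file and is not covered here.

```python
CACHE_DIRECTIVES = ("cachesize", "dncachesize", "idlcachesize")


def cache_settings(conf_text):
    """Return {directive: value or None} for the cache directives in slapd.conf text."""
    found = dict.fromkeys(CACHE_DIRECTIVES)
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        key = parts[0].lower()
        if key in found and len(parts) > 1:
            found[key] = parts[1]
    return found


# Hypothetical slapd.conf excerpt; the sizes are placeholders, not recommendations.
sample = """\
database hdb
suffix "dc=example,dc=com"
cachesize 30000
# idlcachesize 90000
dncachesize 60000
"""

settings = cache_settings(sample)
missing = [name for name, value in settings.items() if value is None]
```

Here missing would list idlcachesize, since it is commented out in the sample and so falls back to the built-in default.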
But I must admit, a misconfiguration of any of these caches would not lead to such a degradation in presenting search results.
Another approach would be to check the caching behaviour of the clients, and to check the network cache and the disk cache.
-Dieter