Howard,
I downloaded the latest HEAD and compiled it for testing.
In my tests slapd no longer hangs, but there are still some performance issues. I kept the same dncachesize of 3,000,000 with a DB of around 4,000,000 records. The ldapsearch runs fine at a pace of more than 1500 records read per second until it reaches the dncachesize boundary, where the pace drops to less than 20 records per second.
For a record that is not cached, as when starting an ldapsearch for the first time, the limiting factor should be disk I/O. I expected that for an uncached record, even with the dncache full, the pace would remain very similar, since disk I/O would still be the only limiting factor. I monitored system resources, and after the dncache filled up I did not see any increase in disk I/O or any other HW limiting factor that could lead to this considerable drop.
The system took around 20 minutes to reach the first 3,000,000 records of the search, but it would then take more than 10 hours to finish the remaining 1,000,000.
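As a rough check on these figures, here is a minimal sketch that turns the two durations into average rates. The helper name `avg_rate` is only illustrative, and since the last part took "more than 10 hours", the second number is an upper bound rather than a measurement.

```shell
#!/bin/sh
# avg_rate RECORDS SECONDS: average records per second.
# Integer division, so the result is rounded down.
avg_rate() {
    records=$1
    seconds=$2
    echo $(( records / seconds ))
}

# First 3,000,000 records in about 20 minutes.
fast=$(avg_rate 3000000 $((20 * 60)))
# Remaining 1,000,000 records in about 10 hours (it actually took longer).
slow=$(avg_rate 1000000 $((10 * 3600)))

echo "before the dncachesize boundary: $fast records/s"
echo "after the dncachesize boundary:  at most $slow records/s"
```

The averages come out roughly two orders of magnitude apart, which is consistent with the paces observed during the search.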
What is stranger, and different from the official OpenLDAP 2.4.16, is that if I stop the ldapsearch and start a new one, even though there are now records in memory, the pace stays at around 20 records per second.
This creates performance issues, since it looks like the system enters a cache-controlled state where records are read very slowly because of program logic, not because of a system resource (HW) limitation. Please see below some tests I did where these paces can be seen:
[root@brtldp12 ~]# date;cat /backup/test_temp_CONTENT.ldif|egrep -e '^pnnumber' |wc -l;sleep 1;date;cat /backup/test_temp_CONTENT.ldif|egrep -e '^pnnumber' |wc -l
Wed Jun 17 00:27:14 BRT 2009
224
Wed Jun 17 00:27:15 BRT 2009
246

[root@brtldp12 ~]# date;cat /backup/test_temp_CONTENT.ldif|egrep -e '^pnnumber' |wc -l;sleep 1;date;cat /backup/test_temp_CONTENT.ldif|egrep -e '^pnnumber' |wc -l
Wed Jun 17 00:28:03 BRT 2009
3089
Wed Jun 17 00:28:04 BRT 2009
4700
I used the mated machine (replication), which had not filled its cache, and also started a new ldapsearch on the master 1 machine to show that the pace stays slow there.
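The one-second sampling used in the tests above can be wrapped in a small script so the rate is computed directly. This is only a sketch: the function names `count_records` and `rate_per_second` are made up for illustration, and it assumes, as in the tests, that each record written to the LDIF output contributes one line starting with `pnnumber`.

```shell
#!/bin/sh
# count_records FILE: number of lines in FILE that start with "pnnumber".
# Equivalent to: cat FILE | egrep -e '^pnnumber' | wc -l
count_records() {
    grep -c '^pnnumber' "$1"
}

# rate_per_second FILE [SECONDS]: sample the record count twice,
# SECONDS apart (default 1), and print the average growth per second.
rate_per_second() {
    file=$1
    interval=${2:-1}
    before=$(count_records "$file")
    sleep "$interval"
    after=$(count_records "$file")
    echo $(( (after - before) / interval ))
}

# Example (run while ldapsearch is still writing to the file):
#   rate_per_second /backup/test_temp_CONTENT.ldif 10
```

Sampling over a longer interval, such as 10 seconds, should smooth out bursts caused by output buffering.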
I am not sure whether this is the expected behavior, since after the cache is filled the system starts to respond at a very slow rate even though there are already cached records that should speed things up.
Could this be a configuration issue? I do not think so, but I am putting my cache configuration below:
#Cache values
cachesize 10000
dncachesize 3000000
idlcachesize 10000
cachefree 10
Thanks a lot!
Rodrigo.
Howard Chu wrote:
Rodrigo Costa wrote:
OpenLdap group,
I'm having a possible issue that could be a problem. I have a DB with around 4 million entries. In my slapd.conf I use the following cache constraints:
Any comments on whether this could be a configuration issue or some other related issue? Would this be an ITS?
A number of dncache issues have been fixed already in CVS.
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/