Howard,
I downloaded the new HEAD version and did some testing. olmBDBEntryCache now follows the cache configuration.
I also made a small change in slapd.conf, adding the cachefree parameter. Here is how slapd is configured for the bdb caches:
line 123 (cachesize 1000)
line 124 (cachefree 1000)
line 125 (idlcachesize 1000)
line 126 (dncachesize 1000000)
I increased dncachesize because with a small value the search takes considerably more time.
First I tested with a value greater than what was being consumed before, when there was no memory boundary. I expected timings of the same order of magnitude as before, and that is what I got (line 126 (dncachesize 5000000)):
BEFORE CACHED:
1000000
real 5m23.084s
user 0m28.026s
sys 0m6.686s

Then the cache has:
olmBDBEntryCache: 1001
olmBDBDNCache: 4000264
olmBDBIDLCache: 1

AFTER CACHED:
1000000
real 2m31.623s
user 0m28.145s
sys 0m8.637s
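For comparison, the two timings above can be converted to seconds and to entries per second. This is a minimal Python sketch, assuming the 1000000 printed before each timing is the number of entries returned; parse_duration is a hypothetical helper, not part of any OpenLDAP tool.

```python
import re


def parse_duration(text):
    """Convert a shell `time` value such as '5m23.084s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Assumption: each search returned 1,000,000 entries.
ENTRIES = 1_000_000

cold = parse_duration("5m23.084s")  # first run, nothing cached yet
warm = parse_duration("2m31.623s")  # second run, caches populated

print(f"uncached: {cold:.3f} s, {ENTRIES / cold:.0f} entries/s")
print(f"cached:   {warm:.3f} s, {ENTRIES / warm:.0f} entries/s")
print(f"speedup:  {cold / warm:.2f}x")
```

With these numbers the cached run is a little over twice as fast as the uncached one.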
I would like to stress that the test above used a dncachesize that I knew could hold all the DNs in my DB. This gives the same behavior as before, when the whole DB could be held in memory because slapd was not respecting the configured boundaries.
In my system each entry is made up of several DNs. Since my filter matches only one of those DNs, I expected this cache to hold only what the filter touches, that is, about 1,000,000.
So I expect something somewhat greater than 6 minutes, since the first run, when the system is still caching the information (in other words, reading the DB from disk), took around 6 minutes.
With a cache smaller than 4 million entries, I expected just a little more overhead, since slapd would have to flush the cache and load some entries more than once. In any case I would not expect much more than 6 minutes, since flushing memory and loading in pieces should not cause too much overhead.
Only the following queries would not take full advantage of the cache (for example, dropping to half the time on later runs), since the cache cannot hold all the information.
But some things happened that look strange to me. The DN cache grew normally until it reached the 1000000 limit, with the entry cache following its boundary of 1000:

readOnly: FALSE
olmBDBEntryCache: 1001
olmBDBDNCache: 957271
olmBDBIDLCache: 1

But after the DN cache limit is reached, the monitor shows the cache as:

olmBDBEntryCache: 1
olmBDBDNCache: 1000262 (this is a very deterministic number)
olmBDBIDLCache: 1
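The two monitor samples can be compared against the configured limits with a small script. This is only a sketch: parse_monitor and fill_ratio are hypothetical helpers, and the mapping of olmBDBEntryCache, olmBDBDNCache and olmBDBIDLCache to cachesize, dncachesize and idlcachesize is my reading of the monitor attributes.

```python
import re

# Limits copied from the slapd.conf lines in this message
# (cachesize, dncachesize, idlcachesize). The attribute-to-directive
# mapping is an assumption.
LIMITS = {
    "olmBDBEntryCache": 1000,
    "olmBDBDNCache": 1000000,
    "olmBDBIDLCache": 1000,
}


def parse_monitor(text):
    """Extract the olmBDB*Cache counters from monitor output."""
    pairs = re.findall(r"(olmBDB\w+Cache):\s*(\d+)", text)
    return {name: int(value) for name, value in pairs}


def fill_ratio(sample):
    """Return each counter as a fraction of its configured limit."""
    return {name: sample[name] / LIMITS[name] for name in sample}


before = parse_monitor(
    "olmBDBEntryCache: 1001 olmBDBDNCache: 957271 olmBDBIDLCache: 1"
)
after = parse_monitor(
    "olmBDBEntryCache: 1 olmBDBDNCache: 1000262 olmBDBIDLCache: 1"
)

for label, sample in (("before limit", before), ("after limit", after)):
    for name, ratio in fill_ratio(sample).items():
        print(f"{label:12} {name:17} {sample[name]:>8} ({ratio:.1%} of limit)")
```

The output makes the change visible: the DN cache sits at its limit while the entry cache falls from its full 1000 to a single entry.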
It stays stuck there forever.
Performance also degrades considerably. From this point on I left it running all night and the DB was still not fully traversed. Some slowdown is to be expected once the DN cache hits its limit, since disk access becomes more frequent, but this makes the search impractical.
I had to stop slapd to see how many entries had been processed. It returned about 250,000, around 1/4 of the DN cache size. It appears that once the cache boundary is reached, for some reason the DB traversal no longer makes progress.
I lost the exact timing from last night's run, but I can send another query I ran this morning with similar information. I can say last night's run took more than 8 hours without getting much further than 250,000 entries (of 1,000,000).
ldap_result: Can't contact LDAP server (-1)
251512
real 121m51.697s
user 0m7.327s
sys 0m1.734s
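To put a number on the slowdown, here is a rough Python sketch comparing the average throughput of this interrupted run with the first uncached run (dncachesize 5000000). It assumes 251512 is the count of entries returned before slapd was stopped, so the rate is only an average up to that point and says nothing about how the rate changed during the run.

```python
def entries_per_second(entries, minutes, seconds):
    """Average throughput for a run timed as <minutes>m<seconds>s."""
    return entries / (minutes * 60 + seconds)


# Uncached run with dncachesize 5000000: 1000000 entries in 5m23.084s.
full_run = entries_per_second(1_000_000, 5, 23.084)

# Interrupted run with dncachesize 1000000: 251512 entries in 121m51.697s.
stalled_run = entries_per_second(251_512, 121, 51.697)

print(f"uncached full run: {full_run:.1f} entries/s")
print(f"interrupted run:   {stalled_run:.1f} entries/s")
print(f"slowdown factor:   {full_run / stalled_run:.0f}x")
```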
This is very deterministic: the search normally gets stuck around this value without much further progress. I do not believe the query will ever end. That is why I had to kill the slapd process, to get an order of magnitude for how far this never-ending search had reached.
Something still looks wrong, since performance is degraded far too much, even allowing for heavier disk use. It looks like the search gets stuck once the cache boundaries are reached.
It also looks like the entry cache (olmBDBEntryCache) drops to 1 once the DN limit is reached.
Best Regards,
Rodrigo.
--- On Sun, 1/25/09, Howard Chu hyc@symas.com wrote:
From: Howard Chu hyc@symas.com
Subject: Re: (ITS#5860) slapd memeory leak under openldap 2.4
To: rlvcosta@yahoo.com
Cc: openldap-its@OpenLDAP.org
Date: Sunday, January 25, 2009, 7:46 PM

Rodrigo Costa wrote:
> Howard,
> I tested the HEAD load last night. The results I believe are partial.
Thanks for the feedback, please try the new patch in HEAD.
> Memory stayed stable and usage was much lower. I had the following cache boundaries in my slapd.conf:
> line 122 (cachesize 1000)
> line 123 (idlcachesize 1000)
> line 124 (dncachesize 2000)
> With the new load this consumes much less memory and stays stable.
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 20715 ldap  18   0  184m  73m  68m S    0  0.6 174:05.81 slapd
> On the other hand, I used the monitor information to check cache usage. I noticed that the DN cache now stays around the boundary of 2000 that I set, but the entry cache is now fixed at 1, not 1000. See the information below:
> readOnly: FALSE
> olmBDBEntryCache: 1
> olmBDBDNCache: 2185
> olmBDBIDLCache: 1
> olmDbDirectory: /var/openldap-data/bdb2/
> entryDN: cn=Database 2,cn=Databases,cn=Monitor
> It stayed around these values the whole time I was searching the DB.
> Before this load with the possible DN cache boundary fix, the ldapsearch took around 7 minutes the first time, with nothing cached yet, and around 4 minutes once the entries were cached.
> Now with this new load I ran the search and it took:
> 1000000
> real 174m6.486s
> user 0m43.746s
> sys 0m16.923s
> Almost 3 hours. Granted, I didn't tune the cache sizes, but I was at least expecting olmBDBEntryCache to follow the 1000 boundary now, not stay at 1.
> I believe this, and not olmBDBDNCache, is what is causing this huge performance difference.
> Please confirm whether some tuning is missing or whether something else, like olmBDBEntryCache, should be improved.
> Memory usage now respects the limits, but performance is much worse, probably because olmBDBEntryCache is using only 1 slot.
> Please also let me know whether you think this performance degradation and olmBDBEntryCache showing 1 are a tuning issue, or whether something still needs to be modified in the code.
> Thanks for the fast fix for this issue, since it would affect all systems and types of DB.
> Rodrigo.
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/