Howard,
Now it behaves as I expected. In the initial load, without the fix, slapd did not respect any boundary, so all DN information ended up being cached.
As a result, only the first query took longer, since the information was not yet cached. At that time, a search over around 1 million entries took around 7 minutes without the cache, and around half of that once cached.
So with the new load I was expecting times a little longer than the initial uncached run, since slapd can no longer cache all the information and the cache will often be overwritten.
In my tests I obtained:
First time: real 6m50.017s user 0m17.346s sys 0m9.159s
Second time: real 6m47.790s user 0m17.032s sys 0m9.137s
Third time: real 7m17.459s user 0m17.412s sys 0m9.320s
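For easier comparison, the three timings above can be converted to seconds. This is only a small sketch: the helper name real_seconds is mine, and it assumes the "real XmY.YYYs" format that the shell's time builtin printed here.

```python
import re

def real_seconds(line):
    """Extract the wall-clock ("real") time from a `time` output line, in seconds."""
    m = re.search(r"real\s+(\d+)m([\d.]+)s", line)
    if m is None:
        raise ValueError("no 'real XmY.YYYs' field found in: " + line)
    return int(m.group(1)) * 60 + float(m.group(2))

# The three runs exactly as reported in this message.
runs = [
    "First time: real 6m50.017s user 0m17.346s sys 0m9.159s",
    "Second time: real 6m47.790s user 0m17.032s sys 0m9.137s",
    "Third time: real 7m17.459s user 0m17.412s sys 0m9.320s",
]

seconds = [real_seconds(r) for r in runs]
mean = sum(seconds) / len(seconds)
spread = max(seconds) - min(seconds)

for run, s in zip(runs, seconds):
    print(f"{run.split(':')[0]:<12} {s:8.3f} s")
print(f"mean {mean:.3f} s, spread {spread:.3f} s")
```

All three runs stay close to the 7 minutes of the original uncached search, which is what I would expect when the cache cannot hold the whole result set.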
This is more than perfect. Amazing how you guys could solve this problem so quickly.
I sometimes saw the DN cache allocate a little more memory, since it occasionally goes slightly past the boundary, as seen below:
olmBDBEntryCache: 1000 olmBDBDNCache: 1000247 olmBDBIDLCache: 1
The maximum I saw was 1000268, which is not significant. So slapd now keeps its memory allocation stable, as it is supposed to.
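To put the overshoot in proportion, here is a quick calculation. It assumes the configured dncachesize is 1000000; that value is my assumption for illustration and is not quoted in this message. Only the two olmBDBDNCache counts come from the observations above.

```python
# Assumed limit: dncachesize = 1000000 (not stated in this message).
DNCACHESIZE = 1_000_000

# olmBDBDNCache values observed in cn=monitor, as reported above.
observed = {"sample": 1_000_247, "maximum": 1_000_268}

def overshoot(count, limit):
    """Return (entries over the limit, overshoot as a percentage of the limit)."""
    extra = count - limit
    return extra, 100.0 * extra / limit

for label, count in observed.items():
    extra, pct = overshoot(count, DNCACHESIZE)
    print(f"{label}: {extra} entries over the limit ({pct:.4f}%)")
```

Under that assumption the cache goes at most a few hundred entries past the limit, well below a tenth of a percent.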
My only comment is that once the DN cache limit is passed, the extra memory allocated appears never to be released. The maximum size slapd reached was:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10195 ldap      18   0  385m 266m  68m S    0  2.2  20:54.16 slapd
This is more than perfect !
Thanks for your quick solution to this problem. I believe many OpenLDAP users will see their systems become more stable with this fix.
Just some last questions. Is a new release, for example 2.4.15, expected to include this ITS? I would prefer to use a formal release.
From the tests, I was expecting that setting cachefree equal to the cache size would bring some improvement, at least for a full query, since the system would "cycle" the memory pointers and overwrite the whole cache at once. What is the purpose of cachefree?
I ask just to understand it better, so that I can tune later with a better idea of what each cache is for.
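To make my expectation concrete, here is a toy model of it. It is not slapd's implementation: the function scan and its eviction policy (drop the oldest cachefree entries whenever the cache is full) are my own simplification, based only on my reading of cachefree as the number of entries freed when the entry cache reaches cachesize.

```python
from collections import OrderedDict

def scan(total, cachesize, cachefree):
    """Toy sequential scan over `total` entries through a bounded cache.

    When the cache is full, the oldest `cachefree` entries are dropped in one
    purge. Returns (number of purges, entries left in the cache at the end).
    """
    cache = OrderedDict()
    purges = 0
    for entry_id in range(total):
        if len(cache) >= cachesize:
            # Free a batch of the oldest entries to make room.
            for _ in range(min(cachefree, len(cache))):
                cache.popitem(last=False)
            purges += 1
        cache[entry_id] = True
    return purges, len(cache)

# cachefree = 1: one purge per new entry once the cache is full.
print(scan(5000, 1000, 1))
# cachefree = cachesize: the whole cache is purged once per 1000 entries.
print(scan(5000, 1000, 1000))
```

In this model a cachefree equal to the cachesize purges far less often during a full sequential scan, which is why I expected some gain from it. Whether the real entry cache behaves this way is exactly what I am asking.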
Best Regards,
Rodrigo.
--- On Mon, 1/26/09, Howard Chu hyc@symas.com wrote:
From: Howard Chu hyc@symas.com
Subject: Re: (ITS#5860) slapd memeory leak under openldap 2.4
To: rlvcosta@yahoo.com
Cc: openldap-its@OpenLDAP.org
Date: Monday, January 26, 2009, 6:00 PM

Rodrigo Costa wrote:
Howard,
I do not think the ITS is completely solved. Let me try to explain better.
The system now respects the DN cachesize boundary. At the same time, I wasn't sure about the cachefree idea, so I have now changed it to 1, which equals the default value.
In any case, what happens is:
- The system does the sequential search at a reasonable speed until the dncachesize is reached;
- After the dncachesize is reached, the sequential search hangs, or the output from the search gets stuck for a long time (I'm redirecting to a file, so there are no screen update delays);
Ex:
[root@brtldp11 backup]# date; cat temp.ldif |grep -e '^pnnum*' |wc -l
Mon Jan 26 17:22:21 BRST 2009
250016
[root@brtldp11 backup]# date; cat temp.ldif |grep -e '^pnnum*' |wc -l
Mon Jan 26 17:23:00 BRST 2009
250016
See above that even after almost 1 minute had passed, no new LDIF entry was added.
- Then the query gets stuck, and it looks deterministic: from time to time it dumps only 16 entries and gets stuck again. This behavior repeats, and with these stalls, which sometimes last minutes, the query never ends.
olmBDBEntryCache: 884 olmBDBDNCache: 1000261 olmBDBIDLCache: 1
olmBDBEntryCache: 611 olmBDBDNCache: 1000261 olmBDBIDLCache: 1
Even with cachesize as 1000 and cachefree as 1, the olmBDBEntryCache continues to decrease, just more slowly now.
I was expecting that a cachefree of 1000 would purge all entries and then cache 1000 new ones in sequence, always answering the search. So the search would never hang the way it does now.
I see. OK, this is now fixed in HEAD.
-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/