I do not think the ITS is completely solved. Let me try to explain better.
The system now respects the dncachesize boundary. I was not sure about the cachefree idea, so I have changed it to 1, which is the default value.
1) The system does the sequential search at a reasonable speed until dncachesize is reached;
2) After dncachesize is reached, the sequential search hangs, or rather the output from the search stalls for a long time (I am redirecting the output to a file, so screen refresh delays are not a factor);
Ex:
[root@brtldp11 backup]# date; cat temp.ldif |grep -e '^pnnum*' |wc -l
Mon Jan 26 17:22:21 BRST 2009
250016
[root@brtldp11 backup]# date; cat temp.ldif |grep -e '^pnnum*' |wc -l
Mon Jan 26 17:23:00 BRST 2009
250016
Note above that even after almost 1 minute, no new LDIF entry was written.
3) Then the query stalls and, apparently deterministically, dumps only 16 entries from time to time before stalling again. This behavior repeats, and since the stalls sometimes last minutes, the query never ends.
Even with cachesize set to 1000 and cachefree set to 1, olmBDBEntryCache continues to decrease, just more slowly now.
I was expecting that a cachefree of 1000 would purge all entries and then cache the next 1000 in sequence, always answering the search, so the search would never hang the way it does now.
It stalls and will never end, since it returns only 16 entries at a time, in bursts that are minutes apart.
The previous behavior was more reasonable than the current one: even though it took much longer, the search would finish.
[root@brtldp11 backup]# date; cat temp.ldif |grep -e '^pnnum*' |wc -l
Mon Jan 26 17:31:36 BRST 2009
250128
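The manual date/grep checks above could be automated to record how fast the output file grows. The following is only a rough Python sketch: it assumes the search output is being redirected to a file such as temp.ldif and that each returned entry contributes one line starting with the pnnum attribute, as in the grep above. The function names count_entries and watch are made up for this illustration.

```python
import time


def count_entries(path, prefix="pnnum"):
    """Count LDIF lines that start with the given attribute prefix."""
    with open(path, "r", errors="replace") as f:
        return sum(1 for line in f if line.startswith(prefix))


def watch(path, interval=10.0, samples=6, prefix="pnnum"):
    """Sample the entry count `samples` times, `interval` seconds apart.

    Prints a timestamp, the current count, and the growth since the
    previous sample, and returns the (count, growth) pairs.
    """
    results = []
    previous = None
    for i in range(samples):
        if i > 0:
            time.sleep(interval)
        current = count_entries(path, prefix)
        delta = 0 if previous is None else current - previous
        print(time.strftime("%H:%M:%S"), current, "+%d" % delta)
        results.append((current, delta))
        previous = current
    return results


# Example (hypothetical file name): watch("temp.ldif", interval=30, samples=10)
```

A run of samples with zero growth followed by a small jump would correspond to the stall-and-burst pattern described in this message.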
Rodrigo.
From: Howard Chu hyc@symas.com
Subject: Re: (ITS#5860) slapd memeory leak under openldap 2.4
To: rlvcosta@yahoo.com
Cc: openldap-its@OpenLDAP.org
Date: Monday, January 26, 2009, 5:18 PM
Rodrigo Costa wrote:
Howard,
I downloaded the new HEAD version and did some testing. Now the olmBDBEntryCache is following the cache configuration.
Good, then this ITS is resolved. Usage questions should be directed to the -software mailing list.
I also made a small change in slapd.conf, including the cachefree configuration. Please see below how slapd is configured with respect to the bdb cache:
line 123 (cachesize 1000)
line 124 (cachefree 1000)
line 125 (idlcachesize 1000)
line 126 (dncachesize 1000000)
Setting cachefree equal to cachesize will effectively cause the entire entry cache to be dumped each time it reaches its maximum size. That's clearly not a good idea.
I increased the dncachesize, since with a small value the search takes considerably more time.
First I tested with a value greater than what was being consumed before, without the memory boundary. This way I expected times of the same order of magnitude as before, which is what I got (line 126 (dncachesize 5000000)):
BEFORE CACHED:
1000000
real 5m23.084s
user 0m28.026s
sys 0m6.686s
Then Cache has :
olmBDBEntryCache: 1001
olmBDBDNCache: 4000264
olmBDBIDLCache: 1
AFTER CACHED:
1000000
real 2m31.623s
user 0m28.145s
sys 0m8.637s
To be clear, the tests above were run with the new logic and with a DN cache size bigger than the final number I had with the old logic, where all information is cached in memory.
For each entry I have 4 DNs that compose the entry. Since my filter is for only one of the DNs, I was expecting this cache to relate only to the filter and therefore to hold 1,000,000.
Then I changed the dncachesize to 1000000 (line 126 (dncachesize 1000000)):
The total time for a search using a filter on one indexed DN became:
ldap_result: Can't contact LDAP server (-1)
255801
real 469m26.619s
user 0m7.027s
sys 0m1.756s
I needed to kill the server to get an idea of how many entries it had searched. The time is too long and I'm not sure it would ever end. I let it run all night but the process did not finish. This was the same ldapsearch as when all entries would fit in the DN cache in memory.
But the cache behavior was something I wasn't expecting. Before the dncachesize was reached, the monitor showed:
olmBDBEntryCache: 1001
olmBDBDNCache: 921296
olmBDBIDLCache: 1
Then after the boundary was reached it became :
olmBDBEntryCache: 1
olmBDBDNCache: 1000262
olmBDBIDLCache: 1
These numbers didn't change anymore. I'm not sure why, after the dncachesize is reached, the cachesize (or olmBDBEntryCache) became 1.
Because you set cachefree to 1000. 1001 - 1000 = 1.
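The arithmetic above can be reproduced with a toy model. The Python sketch below is not back-bdb's actual code: it assumes a simple cache that, as soon as its size exceeds cachesize, evicts cachefree of its oldest entries. EntryCacheModel and simulate are hypothetical names used only for this illustration.

```python
from collections import OrderedDict


class EntryCacheModel:
    """Toy model of an entry cache with cachesize/cachefree settings."""

    def __init__(self, cachesize, cachefree):
        self.cachesize = cachesize
        self.cachefree = cachefree
        self.entries = OrderedDict()  # insertion order: oldest first
        self.purges = 0

    def add(self, entry_id):
        """Cache an entry; purge cachefree entries once over the limit."""
        self.entries[entry_id] = True
        self.entries.move_to_end(entry_id)
        if len(self.entries) > self.cachesize:
            for _ in range(min(self.cachefree, len(self.entries))):
                self.entries.popitem(last=False)  # drop the oldest entry
            self.purges += 1


def simulate(cachesize, cachefree, n):
    """Add n distinct entries; return (entries left in cache, purge count)."""
    cache = EntryCacheModel(cachesize, cachefree)
    for i in range(n):
        cache.add(i)
    return len(cache.entries), cache.purges


print(simulate(1000, 1000, 1001))  # (1, 1): 1001 - 1000 = 1 entry left
print(simulate(1000, 1, 1001))     # (1000, 1): only the oldest entry freed
```

Under this assumed model, cachefree equal to cachesize empties almost the whole cache on every purge, while a cachefree of 1 keeps the cache full and frees a single entry per new entry added.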
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/