Re: (ITS#5221) cache? of parent failes for hdb
by hyc@symas.com
Dan Oscarsson wrote:
> tis 2007-11-20 klockan 07:09 -0800 skrev Howard Chu:
>> Dan.Oscarsson(a)tietoenator.com wrote:
>>> If this looks correct to you, what code should I add to fix it?
>>> It would be better if one of you who knows the code better than me could
>>> do that. I can test and see if it works.
>
>> Thanks for your investigation; this explanation makes sense. You can test this
>> simply by disabling the statement in bdb_cache_modrdn() which sets the NO_KIDS
>> flags:
>
> I have tested this and it fixed my reduced test case. Running the full test
> still fails.
> So I suspected it might have to do with CACHE_ENTRY_NO_GRANDKIDS
> and tested by removing the check for CACHE_ENTRY_NO_GRANDKIDS in dn2id.c:
> ***************
> *** 1044,1058 ****
> if ( cx->prefix == DN_SUBTREE_PREFIX ) {
> bdb_idl_append( cx->ids, cx->tmp );
> cx->need_sort = 1;
> ! /*if ( !(cx->ei->bei_state & CACHE_ENTRY_NO_GRANDKIDS)) {*/
> ! {
>
> to get it to go down in the tree.
>
> This time even my full test works, from what I can see. Though this may
> not be the correct way to do it, and I do not know if something else
> may go wrong.
> It looks like the setting and usage of CACHE_ENTRY_NO_GRANDKIDS and
> CACHE_ENTRY_NO_KIDS have to be looked over. From what I can see they are
> only used in cache.c, dn2id.c and modrdn.c in back-bdb/hdb.
> Should I try something else?
I think that's enough to confirm the problems. I won't be able to work on the
fix for a few hours; will update here when I have something ready.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
16 years
Re: (ITS#5221) cache? of parent failes for hdb
by Dan.Oscarsson@tietoenator.com
tis 2007-11-20 klockan 07:09 -0800 skrev Howard Chu:
> Dan.Oscarsson(a)tietoenator.com wrote:
> > If this looks correct to you, what code should I add to fix it?
> > It would be better if one of you who knows the code better than me could
> > do that. I can test and see if it works.
> >
> Thanks for your investigation; this explanation makes sense. You can test this
> simply by disabling the statement in bdb_cache_modrdn() which sets the NO_KIDS
> flags:
I have tested this and it fixed my reduced test case. Running the full test
still fails.
So I suspected it might have to do with CACHE_ENTRY_NO_GRANDKIDS
and tested by removing the check for CACHE_ENTRY_NO_GRANDKIDS in dn2id.c:
***************
*** 1044,1058 ****
if ( cx->prefix == DN_SUBTREE_PREFIX ) {
bdb_idl_append( cx->ids, cx->tmp );
cx->need_sort = 1;
! /*if ( !(cx->ei->bei_state & CACHE_ENTRY_NO_GRANDKIDS)) {*/
! {
to get it to go down in the tree.
This time even my full test works, from what I can see. Though this may
not be the correct way to do it, and I do not know if something else
may go wrong.
It looks like the setting and usage of CACHE_ENTRY_NO_GRANDKIDS and
CACHE_ENTRY_NO_KIDS have to be looked over. From what I can see they are
only used in cache.c, dn2id.c and modrdn.c in back-bdb/hdb.
Should I try something else?
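To illustrate why skipping the CACHE_ENTRY_NO_GRANDKIDS check changes the search result, here is a minimal toy sketch in C. It is not the dn2id.c code: the struct toy_node, the TOY_NO_GRANDKIDS value and toy_count_subtree() are all hypothetical names made up for this illustration, and the model assumes only that the flag stops the walk from descending below an entry's direct children.

```c
#include <stddef.h>

/* Illustrative flag value only; not taken from the back-bdb headers. */
#define TOY_NO_GRANDKIDS 0x02u

/* Hypothetical, heavily simplified tree node. */
struct toy_node {
    unsigned state;          /* may carry a stale TOY_NO_GRANDKIDS */
    size_t nkids;            /* number of direct children */
    struct toy_node **kids;  /* direct children */
};

/*
 * Count all descendants of n. The direct children are always counted.
 * When honor_flag is nonzero, a node marked TOY_NO_GRANDKIDS is not
 * descended into, so anything below its children is missed.
 */
static size_t toy_count_subtree(const struct toy_node *n, int honor_flag)
{
    size_t total = n->nkids;

    if (honor_flag && (n->state & TOY_NO_GRANDKIDS))
        return total;

    for (size_t i = 0; i < n->nkids; i++)
        total += toy_count_subtree(n->kids[i], honor_flag);

    return total;
}
```

With a three-level chain whose top node carries a stale flag, toy_count_subtree(&root, 1) sees only the direct child, while toy_count_subtree(&root, 0) also reaches the grandchild. Ignoring the flag always gives the complete answer in this model, at the cost of the pruning the flag was meant to provide.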
Regards,
Dan
--
Dan Oscarsson
TietoEnator Email: Dan.Oscarsson(a)tietoenator.com
Box 85
201 20 Malmo, Sweden
16 years
Re: (ITS#5235) ppolicy leads to segfault
by dieter@dkluenter.de
Quanah Gibson-Mount <quanah(a)zimbra.com> writes:
> --On November 15, 2007 9:22:41 PM +0000 dieter(a)dkluenter.de wrote:
>
>> Full_Name: Dieter Kluenter
>> Version: HEAD
>> OS:
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (84.142.231.219)
>
> Dieter,
>
> Can you confirm one way or the other if the fix in HEAD from
> Pierangelo solved your issue?
I just did a cvs update of HEAD and ran a few tests. As I now get no
errors, it seems that this ITS is solved.
-Dieter
--
Dieter Klünter | Systemberatung
http://www.dkluenter.de
GPG Key ID:8EF7B6C6
16 years
Re: (ITS#5235) ppolicy leads to segfault
by quanah@zimbra.com
--On November 15, 2007 9:22:41 PM +0000 dieter(a)dkluenter.de wrote:
> Full_Name: Dieter Kluenter
> Version: HEAD
> OS:
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (84.142.231.219)
Dieter,
Can you confirm one way or the other if the fix in HEAD from Pierangelo
solved your issue?
Thanks,
Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
16 years
Re: (ITS#5221) cache? of parent failes for hdb
by hyc@symas.com
Dan.Oscarsson(a)tietoenator.com wrote:
> If this looks correct to you, what code should I add to fix it?
> It would be better if one of you who knows the code better than me could
> do that. I can test and see if it works.
>
> I hope this is the only place in the cache handling of an entry's children,
> though maybe someone with better knowledge of the code can identify
> others.
>
> I hope this is the bug, as I have spent many days tracing it down and need
> to do some normal work for my company.
Thanks for your investigation; this explanation makes sense. You can test this
simply by disabling the statement in bdb_cache_modrdn() which sets the NO_KIDS
flags:
Index: cache.c
===================================================================
RCS file: /repo/OpenLDAP/pkg/ldap/servers/slapd/back-bdb/cache.c,v
retrieving revision 1.157
diff -u -r1.157 cache.c
--- cache.c 12 Nov 2007 10:41:45 -0000 1.157
+++ cache.c 20 Nov 2007 15:09:29 -0000
@@ -1148,8 +1148,10 @@
free( ei->bei_nrdn.bv_val );
ber_dupbv( &ei->bei_nrdn, nrdn );
+#if 0
if ( !pei->bei_kids )
pei->bei_state |= CACHE_ENTRY_NO_KIDS | CACHE_ENTRY_NO_GRANDKIDS;
+#endif
#ifdef BDB_HIER
free( ei->bei_rdn.bv_val );
Please try that and report your results. I don't think this is the best fix,
but if it helps your test then we can refine it from there.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
16 years
Re: (ITS#5221) cache? of parent failes for hdb
by Dan.Oscarsson@tietoenator.com
mån 2007-11-12 klockan 15:15 -0800 skrev Quanah Gibson-Mount:
> --On Monday, November 12, 2007 7:02 AM +0000 hyc(a)symas.com wrote:
> >
> > This isn't a lot of information to go on. If you can create a test
> > program that shows the problem occurring, using dummy data, that would
> > help. --
>
> Also, some general information on what you are doing, in a bit more
> detail, would help.
I have now done several days of testing and think I have tracked down what is
wrong. All my tests were done with 2.3.38.
What my program does is move one person from one branch to
another. It does the following (simplified):
1) searches for the person entry using base o=xxx and filter uid=yyyy
2) does a modrdn from cn=qqq+uid=yyy,a=aaaa,b=bbbb,c=cccc,o=xxx
to cn=qqq+uid=yyy,d=dddd,e=eeee,c=cccc,o=xxx
I now do this on a newly started LDAP server (that is, the cache has not
been filled). This is a special case that I found triggers what is
probably the same bug I hit previously, though at that time the server may
have been running for a long time.
My analysis of all my debug prints indicates the following.
During 1) above, the person entry is located and
hdb_cache_find_parent is called, which calls hdb_dn2id_parent to find the
way to the root. From what I can see, this constructs cache entries with
one kid entry in bei_kids. It does not load all kids of each entry found
along the path to the root.
Next, in 2), modrdn is called, and with it bdb_cache_modrdn. This removes the
person entry from the a=aaaa entry. As the a=aaaa entry was cached
during 1) with just one child set instead of all children loaded from
disk, that entry now has no children (bei_kids is NULL), so the state is
set to CACHE_ENTRY_NO_KIDS.
If I then do a search with base c=cccc,o=xxx, the hdb_dn2idl
routine will not find all entries, because in hdb_dn2idl_internal the
cached entry of a=aaaa,b=bbbb,c=cccc,o=xxx has state
CACHE_ENTRY_NO_KIDS and is ignored.
If modrdn in step 2) had loaded all kids from disk before deleting the
entry from its parent's list of kids, it should have worked.
So the problem, if my analysis is correct, is that sometimes cache
entries are created without their children having been loaded from disk, and
then another cache routine changes the number of children in the cache
without first loading the correct number of children from disk.
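One possible direction, shown only as a sketch and not as the fix for this ITS, is to set the flags only when the cached child list is known to be complete. The kids_complete field, struct toy_parent and toy_remove_kid() below are hypothetical; nothing like them is claimed to exist in back-bdb/hdb.

```c
#include <stdbool.h>

/* Illustrative flag values only; not taken from the back-bdb headers. */
#define TOY_NO_KIDS      0x01u
#define TOY_NO_GRANDKIDS 0x02u

/* Hypothetical parent entry that remembers whether its cached child
 * list reflects everything on disk. */
struct toy_parent {
    int cached_kids;     /* children currently linked into the cache */
    bool kids_complete;  /* assumption: true once all kids were loaded */
    unsigned state;      /* TOY_* flags */
};

/* Unlink one child from the cache. The "no kids" flags are set only if
 * the cached list is empty AND known to be complete. */
static void toy_remove_kid(struct toy_parent *p)
{
    if (p->cached_kids > 0)
        p->cached_kids--;

    if (p->cached_kids == 0 && p->kids_complete)
        p->state |= TOY_NO_KIDS | TOY_NO_GRANDKIDS;
}
```

In this model, a parent that was only partially loaded keeps state 0 after its single cached child is removed, so a later search still has to consult the disk. A fully loaded parent is flagged as before.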
If this looks correct to you, what code should I add to fix it?
It would be better if one of you who knows the code better than me could
do that. I can test and see if it works.
I hope this is the only place in the cache handling of an entry's children,
though maybe someone with better knowledge of the code can identify
others.
I hope this is the bug, as I have spent many days tracing it down and need
to do some normal work for my company.
Regards,
Dan
--
Dan Oscarsson
TietoEnator Email: Dan.Oscarsson(a)tietoenator.com
Box 85
201 20 Malmo, Sweden
16 years