/* There are two approaches to the search function: * 1) walk the list of filter candidates, and see which are in scope. * 2) walk the scope tree, and see which are in the filter candidates. * (1) is faster if the filter candidate list is smaller than the scope tree, * and vice versa. Currently only (1) is implemented. * * We don't know the actual size of the subtree, we only know the count of * onelevel children. For subtree searches we take a guess. * * Due to the fact pagedResults cookie is only big enough to store a single * entryID, if pagedResults is in use we will always do (1). For (2) to work * reliably we need to use the parent's entryID, to avoid losing our place * if the target entry is moved or deleted. We'd also need to store the * count of where we were under the parent. I.e., we need the cookie to be * two words long. We can examine that in the future, but the pagedResults * cookie definition is global to all of slapd so the change would impact * other backends and overlays. */
We could alter the dn2id index format, and maintain a numSubordinates counter there. That would avoid the need to guess. Otherwise I figure we fudge a guess based on the number of onelevel children and the depth of the baseDN (relative to the suffix). I.e., the longer the DN, the smaller the guess, relative to the total number of entries in the DB. Naturally I'd prefer to avoid altering the dn2id index format.
Suggestions?
Le 1/11/13 1:51 PM, Howard Chu a écrit :
/* There are two approaches to the search function: * 1) walk the list of filter candidates, and see which are in scope. * 2) walk the scope tree, and see which are in the filter
candidates. * (1) is faster if the filter candidate list is smaller than the scope tree, * and vice versa. Currently only (1) is implemented. * * We don't know the actual size of the subtree, we only know the count of * onelevel children. For subtree searches we take a guess. * * Due to the fact pagedResults cookie is only big enough to store a single * entryID, if pagedResults is in use we will always do (1). For (2) to work * reliably we need to use the parent's entryID, to avoid losing our place * if the target entry is moved or deleted. We'd also need to store the * count of where we were under the parent. I.e., we need the cookie to be * two words long. We can examine that in the future, but the pagedResults * cookie definition is global to all of slapd so the change would impact * other backends and overlays. */
We could alter the dn2id index format, and maintain a numSubordinates counter there. That would avoid the need to guess. Otherwise I figure we fudge a guess based on the number of onelevel children and the depth of the baseDN (relative to the suffix). I.e., the longer the DN, the smaller the guess, relative to the total number of entries in the DB. Naturally I'd prefer to avoid altering the dn2id index format.
ApacheDS stores the number of children, and the bumber of descendants alltogether, in the dn2id table. That allows use to use the scope as the primary filter for candidates. The extra cost is when you add/delete/move an entry, you have to update all the dn2id (in OpenLDAP terminology) parents from the modified entry down to the root. That's a price we accept to pay.
Once you have this number, you can use either (1) or (2) to start with. At least, you are sure to pick the smallest set of candidates.
Before that, we had a OneLevel index and a SubLevel index which were containing the number of children/descendant for each entry. This is a way to avoid modifying the dn2id table, but that's two index you have to update everytime you do an add/del/mod. We get rid of those two tables recently, gaining some time and disl space in the process.
Howard Chu wrote:
We could alter the dn2id index format, and maintain a numSubordinates counter there.
Would this make it easier to implement this?
http://tools.ietf.org/html/draft-boreham-numsubordinates-01
Ciao, Michael.
Le 1/11/13 7:48 PM, Michael Ströder a écrit :
Howard Chu wrote:
We could alter the dn2id index format, and maintain a numSubordinates counter there.
Would this make it easier to implement this?
Implementing this RFC is just a side effect of having stored the number of subordinates stored in the dn2id index (or somehwere else).
So adding the operational attribute in each returned entry is pretty straight forward as soon as you have the value at hand.
Emmanuel Lécharny wrote:
Le 1/11/13 7:48 PM, Michael Ströder a écrit :
Howard Chu wrote:
We could alter the dn2id index format, and maintain a numSubordinates counter there.
Would this make it easier to implement this?
Implementing this RFC is just a side effect of having stored the number of subordinates stored in the dn2id index (or somehwere else).
So adding the operational attribute in each returned entry is pretty straight forward as soon as you have the value at hand.
In fact, since numSubordinates only reports the number of immediate (i.e. onelevel) children, it could be implemented already. Back-mdb always has the exact count of onelevel children for any entry. (back-hdb could also, but I think this counting operation is slow in BerkeleyDB.)
Howard Chu wrote:
Emmanuel Lécharny wrote:
Le 1/11/13 7:48 PM, Michael Ströder a écrit :
Howard Chu wrote:
We could alter the dn2id index format, and maintain a numSubordinates counter there.
Would this make it easier to implement this?
Implementing this RFC is just a side effect of having stored the number of subordinates stored in the dn2id index (or somehwere else).
So adding the operational attribute in each returned entry is pretty straight forward as soon as you have the value at hand.
In fact, since numSubordinates only reports the number of immediate (i.e. onelevel) children, it could be implemented already. Back-mdb always has the exact count of onelevel children for any entry. (back-hdb could also, but I think this counting operation is slow in BerkeleyDB.)
Would be nice to have that feature even when limited to back-mdb.
Ciao, Michael.