I'd like to understand how indexes are used with search. I've found some info http://www.openldap.org/doc/admin24/tuning.html#Understanding how a search works http://www.openldap.org/doc/admin24/tuning.html#Understanding%20how%20a%20search%20works but that's not very precise.
Suppose there is a search like (&(attr1=value1)(attr2=value2)), attr1 and attr2 are indexed. As far as I understand, this search fetches objectIDs from the index on attr1, than the objectIDs from the index on attr2, than, according to operation "and" ("&"), objectIDs that present in BOTH sets are determined (using some set-intersection algorithm). When the search is "or" (|(...)(...)), the objectIDs from both sets are joined and duplicates are removed (using some set-union algorithm). If there is no index on attribute, the resulting objectID set is determined by bare iterate-and-compare on all the objects, or so. Am I right?
Could anyone advice some docs on this?
Roman Rybalko wrote:
I'd like to understand how indexes are used with search. I've found some info http://www.openldap.org/doc/admin24/tuning.html#Understanding how a search works http://www.openldap.org/doc/admin24/tuning.html#Understanding%20how%20a%20search%20works but that's not very precise.
Suppose there is a search like (&(attr1=value1)(attr2=value2)), attr1 and attr2 are indexed. As far as I understand, this search fetches objectIDs from the index on attr1, than the objectIDs from the index on attr2, than, according to operation "and" ("&"), objectIDs that present in BOTH sets are determined (using some set-intersection algorithm). When the search is "or" (|(...)(...)), the objectIDs from both sets are joined and duplicates are removed (using some set-union algorithm). If there is no index on attribute, the resulting objectID set is determined by bare iterate-and-compare on all the objects, or so. Am I right?
That's correct.
25.09.2012 14:37, Howard Chu пишет:
Roman Rybalko wrote:
Suppose there is a search like (&(attr1=value1)(attr2=value2)), attr1 and attr2 are indexed. As far as I understand, this search fetches objectIDs from the index on attr1, than the objectIDs from the index on attr2, than, according to operation "and" ("&"), objectIDs that present in BOTH sets are determined (using some set-intersection algorithm). When the search is "or" (|(...)(...)), the objectIDs from both sets are joined and duplicates are removed (using some set-union algorithm). If there is no index on attribute, the resulting objectID set is determined by bare iterate-and-compare on all the objects, or so.
That's correct.
Thanks!
Also it's not clear for me how objectID set is determined when there is search on "one" scope? The same when "sub" scope cover not the full database contents but a part of the object tree (suffix: cn=test, search base: cn=test1,cn=test)?
I thought that search with base covering not the full backend suffix (but a partial tree) should work faster, but seems this is not true, even overhead introduced by additional objectID set. Right?
25.09.2012 14:57, Roman Rybalko пишет:
Also it's not clear for me how objectID set is determined when there is search on "one" scope? The same when "sub" scope cover not the full database contents but a part of the object tree (suffix: cn=test, search base: cn=test1,cn=test)? I thought that search with base covering not the full backend suffix (but a partial tree) should work faster, but seems this is not true, even overhead introduced by additional objectID set. Right?
Howard, can you explain it please?
Roman Rybalko wrote:
25.09.2012 14:57, Roman Rybalko пишет:
Also it's not clear for me how objectID set is determined when there is search on "one" scope? The same when "sub" scope cover not the full database contents but a part of the object tree (suffix: cn=test, search base: cn=test1,cn=test)? I thought that search with base covering not the full backend suffix (but a partial tree) should work faster, but seems this is not true, even overhead introduced by additional objectID set. Right?
Howard, can you explain it please?
The scope is just another set, yes. It must be ANDed with the index sets. Searches with subtree scope over the entire database use a shortcut, instead of reading a scope set we just use a fake set representing all IDs. Searches over any other scope cannot take any shortcut.
Le 9/26/12 2:01 PM, Howard Chu a écrit :
Roman Rybalko wrote:
25.09.2012 14:57, Roman Rybalko пишет:
Also it's not clear for me how objectID set is determined when there is search on "one" scope? The same when "sub" scope cover not the full database contents but a part of the object tree (suffix: cn=test, search base: cn=test1,cn=test)? I thought that search with base covering not the full backend suffix (but a partial tree) should work faster, but seems this is not true, even overhead introduced by additional objectID set. Right?
Howard, can you explain it please?
The scope is just another set, yes. It must be ANDed with the index sets. Searches with subtree scope over the entire database use a shortcut, instead of reading a scope set we just use a fake set representing all IDs. Searches over any other scope cannot take any shortcut.
As discussed with Howard last month, OneLevel scope can define a set of candidate IDs too, as soon as you can extract the children of the base entry from the Rdn index. Not sure OpenLDAP support that atm. Howard ?
Emmanuel Lécharny wrote:
Le 9/26/12 2:01 PM, Howard Chu a écrit :
Roman Rybalko wrote:
25.09.2012 14:57, Roman Rybalko пишет:
Also it's not clear for me how objectID set is determined when there is search on "one" scope? The same when "sub" scope cover not the full database contents but a part of the object tree (suffix: cn=test, search base: cn=test1,cn=test)? I thought that search with base covering not the full backend suffix (but a partial tree) should work faster, but seems this is not true, even overhead introduced by additional objectID set. Right?
Howard, can you explain it please?
The scope is just another set, yes. It must be ANDed with the index sets. Searches with subtree scope over the entire database use a shortcut, instead of reading a scope set we just use a fake set representing all IDs. Searches over any other scope cannot take any shortcut.
As discussed with Howard last month, OneLevel scope can define a set of candidate IDs too, as soon as you can extract the children of the base entry from the Rdn index. Not sure OpenLDAP support that atm. Howard ?
The actual answer depends on which backend you're using. There's little point in discussing it when you can just read the source code to get the details. back-bdb uses explicit scope indices. back-hdb and back-mdb use only an rdn index.
AFAIK mdb does not use any index shortcut for "one" search scope. That makes almost impossible to view the tree in a gui tool. Gui tools often use one-scope searches. Using graphical viewing tools is frequently needed for administration stuff.
Are there any plans to implement scope=one searches optimization for mdb?
Roman Rybalko wrote:
AFAIK mdb does not use any index shortcut for "one" search scope. That makes almost impossible to view the tree in a gui tool. Gui tools often use one-scope searches. Using graphical viewing tools is frequently needed for administration stuff.
Are there any plans to implement scope=one searches optimization for mdb?
Feel free to submit a patch to the ITS.
openldap-technical@openldap.org