2011/1/4 Quanah Gibson-Mount quanah@zimbra.com:
--On Tuesday, January 04, 2011 1:43 AM +0100 Steeg Carson steeg.carson@googlemail.com wrote:
I simulated this on my database just now:
I suggest you read:
http://www.openldap.org/lists/openldap-technical/201011/msg00146.html
to understand how indices and their slots work.
As I now understand it, the entire index for one attribute (e.g. objectClass) is "split" into several indexes. Each path/node (i.e. each non-leaf DN) holds a separate index for this attribute with all "hits" for its subtree (and for onelevel too).
If I do an ldapsearch with -b "cn=ownPath,ou=root", does slapd use the index that is bound to this node/DN?
My DIT contains 470812 entries.
objectClass=subEngine occurs 104384 times in the entire directory (ou=root), but only once under "cn=ownPath,ou=root".
I ran the following test with BDB, because there each node holds one index for subtree and one for onelevel (hdb only for onelevel).
When I search (loglevel 33):
ldapsearch -x -h localhost -wpassword -D"uid=admin,ou=root" -b"cn=ownPath,ou=root" "(ObjectClass=subEngine)"
I get in the logfile:
================================================
=> key_read
<= bdb_index_read 470601 candidates
<= bdb_equality_candidates: id=-1, first=228, last=470828
<= bdb_filter_candidates: id=-1 first=228 last=470828
<= bdb_list_candidates: id=-1 first=228 last=470828
<= bdb_filter_candidates: id=-1 first=228 last=470828
<= bdb_list_candidates: id=-1 first=40595 last=470825
<= bdb_filter_candidates: id=-1 first=40595 last=470825
bdb_search_candidates: id=-1 first=40595 last=470825
=> test_filter
    EQUALITY
<= test_filter 5
bdb_search: 40595 does not match filter
.
.
.
================================================
Shouldn't the index at this level hold only one entry?
ldapsearch -x -h localhost -wpassword -D"uid=admin,ou=root" -b"cn=ownPath,ou=root" "(ObjectClass=subEngine)" dn | grep "^dn:" | wc -l
1
Thanks,
Steeg
Steeg Carson wrote:
As I now understand it, the entire index for one attribute (e.g. objectClass) is "split" into several indexes. Each path/node (i.e. each non-leaf DN) holds a separate index for this attribute with all "hits" for its subtree (and for onelevel too).
No. Only the dn2id table maintains any notion of nodes and subtrees. All indices are global to the database and have no notion of scope.
If I do an ldapsearch with -b "cn=ownPath,ou=root", does slapd use the index that is bound to this node/DN?
My DIT contains 470812 entries.
objectClass=subEngine occurs 104384 times in the entire directory (ou=root), but only once under "cn=ownPath,ou=root".
By default an index slot can only maintain 65535 records before it overflows and loses precision. Once it loses precision, you tend to get results like this. If you need to accommodate larger indices you can tweak a constant in back-bdb/back-bdb.h and recompile. You'll probably also need to increase LDAP_PVT_THREAD_STACK_SIZE.
Another workaround, without recompiling, would be to sort your entries such that all of the entries of the subEngine class are loaded in contiguous order.
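
To make the overflow behaviour above more concrete, here is a rough Python sketch of an index slot. It is a simplified model and not OpenLDAP's code: the names store_slot and candidate_count are hypothetical, and the collapse of an overflowing slot into a bare first/last range is an assumption inferred from the first=/last= values in the debug log earlier in this thread.

```python
# Simplified model of one index slot (an ID list); not OpenLDAP's actual code.
IDL_LOGN = 16                        # assumed default, matching the 65535 limit
IDL_DB_MAX = (1 << IDL_LOGN) - 1     # 65535 IDs per slot before overflow


def store_slot(ids, max_ids=IDL_DB_MAX):
    """Keep an exact ID list, or only (first, last) once the slot overflows."""
    ids = sorted(set(ids))
    if len(ids) <= max_ids:
        return ("list", ids)
    return ("range", ids[0], ids[-1])


def candidate_count(slot):
    """Number of candidate IDs a search has to consider for this slot."""
    if slot[0] == "list":
        return len(slot[1])
    _, first, last = slot
    return last - first + 1


# The range reported in the debug log: first=228, last=470828
print(candidate_count(("range", 228, 470828)))   # 470601

# 104384 matching entries scattered through the database (every 4th ID):
scattered = store_slot(range(228, 228 + 4 * 104384, 4))
# The same number of entries loaded in contiguous order:
contiguous = store_slot(range(1000, 1000 + 104384))

print(candidate_count(scattered), candidate_count(contiguous))
```

In this model the scattered slot yields several times more candidates than there are matching entries, while the contiguous slot yields exactly 104384, which is why loading the subEngine entries in contiguous order can work around the lost precision.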
2011/1/6 Howard Chu hyc@symas.com:
As I now understand it, the entire index for one attribute (e.g. objectClass) is "split" into several indexes. Each path/node (i.e. each non-leaf DN) holds a separate index for this attribute with all "hits" for its subtree (and for onelevel too).
No. Only the dn2id table maintains any notion of nodes and subtrees. All indices are global to the database and have no notion of scope.
But what does this mean (from http://www.openldap.org/lists/openldap-technical/201011/msg00146.html):
"Ordinarily at each level of the tree we keep an index tallying all of the children beneath that point. In back-bdb this index is used for subtree searches and for onelevel searches."
So if I do a search, I always get ALL results (IDs) from the index for the searched value. If my search additionally uses a search base, does slapd take all of those IDs, look each one up in id2entry.bdb to get its DN, and compare?
Can you recommend a good book where I can read about all these things and understand how OpenLDAP really works? These are all very important for design and operation.
Thanks for helping
Steeg
--On Thursday, January 06, 2011 1:08 AM +0100 Steeg Carson steeg.carson@googlemail.com wrote:
Can you recommend a good book where I can read about all these things and understand how OpenLDAP really works? These are all very important for design and operation.
Your specific issue isn't a commonly encountered one, so probably wouldn't be addressed in any book. However, I'd recommend any book written by Howard on OpenLDAP. Unfortunately, no such thing exists at this time (although there may someday be one).
For my work at Zimbra, I have written up documentation on performance tuning OpenLDAP 2.4 that covers most other areas. You can find it at:
http://wiki.zimbra.com/wiki/OpenLDAP_Performance_Tuning_6.0
It assumes you are using Zimbra's tools to automatically update the configuration database, but you can easily apply the settings it discusses directly.
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Steeg Carson wrote:
But what does this mean (from http://www.openldap.org/lists/openldap-technical/201011/msg00146.html):
"Ordinarily at each level of the tree we keep an index tallying all of the children beneath that point. In back-bdb this index is used for subtree searches and for onelevel searches."
So if I do a search, I always get ALL results (IDs) from the index for the searched value. If my search additionally uses a search base, does slapd take all of those IDs, look each one up in id2entry.bdb to get its DN, and compare?
The search scope provides a set of candidates, consisting of every entry within that scope of the tree. An index lookup provides a set of candidates, consisting of every entry in the DB that matches the index. The intersection of these sets forms the set of candidate entries that must be examined in depth.
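
The intersection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entry IDs are made up, the function names are hypothetical, and modelling an imprecise index slot as a plain first/last range is an assumption based on the debug log earlier in the thread, not a copy of slapd's logic.

```python
def intersect_candidates(scope_ids, index_ids):
    """Entries that are both inside the search scope and matched by the index."""
    return sorted(set(scope_ids) & set(index_ids))


def intersect_with_range(scope_ids, first, last):
    """Same intersection when the index slot only remembers (first, last)."""
    return sorted(i for i in set(scope_ids) if first <= i <= last)


# Hypothetical entry IDs, for illustration only.
scope_ids = [10, 11, 12, 13, 14]    # every entry under the search base
index_ids = [3, 7, 12, 40, 41]      # every entry in the DB matching the filter

# Precise index: only one entry needs to be examined in depth.
print(intersect_candidates(scope_ids, index_ids))   # [12]

# Imprecise index (range 3..41): every entry in scope becomes a candidate
# and has to be tested against the filter one by one.
print(intersect_with_range(scope_ids, 3, 41))       # [10, 11, 12, 13, 14]
```

With a precise index the intersection shrinks the work to the entries that really match. Once the index side is only a range, the scope side is all that limits the candidate set.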
Can you recommend a good book where I can read about all these things and understand how OpenLDAP really works? These are all very important for design and operation.
There is no book on the internals of back-bdb or hdb. The source code is there for anyone to read. If you want to learn more, there are also extensive discussions on the design approaches and tradeoffs in the archives of the openldap-devel mailing list. For the most part, these discussions have only ever been of interest or relevance to other OpenLDAP developers.
Hello Howard,
By default an index slot can only maintain 65535 records before it overflows and loses precision. Once it loses precision, you tend to get results like this. If you need to accommodate larger indices you can tweak a constant in back-bdb/back-bdb.h and recompile. You'll probably also need to increase LDAP_PVT_THREAD_STACK_SIZE.
Can you please guide me through this? Which constant must be changed, and which values are suitable for this constant and for LDAP_PVT_THREAD_STACK_SIZE?
If I change these values, what other impact does it have on memory usage and on slapd?
Thank you for your help!
Kind regards, Steeg
Steeg Carson wrote:
Can you please guide me through this? Which constant must be changed, and which values are suitable for this constant and for LDAP_PVT_THREAD_STACK_SIZE?
If I change these values, what other impact does it have on memory usage and on slapd?
The constant is BDB_IDL_LOGN in back-bdb/idl.h. Incrementing it by 1 will double the range of an index slot before it loses precision. It will also double the amount of memory used by all of the indexing functions. I think you can safely double the current value without overrunning the default thread stack size. But if you go even higher you'll probably need to increase it.
The current value for LDAP_PVT_THREAD_STACK_SIZE is (1 * 1024 * 1024 * sizeof(void *))
(4MB on a 32 bit machine, 8MB on 64 bit machine).
If you need to raise it I would suggest adding e.g. XDEF=-DLDAP_PVT_THREAD_STACK_SIZE=16777216 to your make invocation and recompiling libldap_r with this new value.
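
As a quick check of the numbers above, the following Python sketch works through the arithmetic. The helper names are hypothetical, and the slot capacity formula (2^BDB_IDL_LOGN - 1) is an assumption derived from the 65535 default quoted earlier in the thread; the stack size formula is the one given above.

```python
def slot_capacity(logn):
    """Assumed number of IDs a slot holds precisely for a given BDB_IDL_LOGN."""
    return (1 << logn) - 1


def min_logn_for(n_ids):
    """Smallest BDB_IDL_LOGN whose assumed capacity covers n_ids entries."""
    logn = 1
    while slot_capacity(logn) < n_ids:
        logn += 1
    return logn


def default_stack_size(pointer_bytes):
    """Default LDAP_PVT_THREAD_STACK_SIZE: 1 * 1024 * 1024 * sizeof(void *)."""
    return 1 * 1024 * 1024 * pointer_bytes


MIB = 1024 * 1024

print(slot_capacity(16))               # 65535
print(slot_capacity(17))               # 131071
print(min_logn_for(104384))            # 17, for the 104384 subEngine entries
print(default_stack_size(4) // MIB)    # 4  (32-bit)
print(default_stack_size(8) // MIB)    # 8  (64-bit)
print(16777216 // MIB)                 # 16 (the suggested XDEF value)
```

Under these assumptions, a single increment of BDB_IDL_LOGN is enough to cover the 104384 subEngine entries, and the suggested XDEF value corresponds to a 16MB thread stack, twice the 64-bit default.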
Hello,
I'm very happy! Thanks to everyone who helped me, especially Howard and Quanah. I raised BDB_IDL_LOGN to 17 and now slapd runs very fast. The search is now more than 100 times faster. Additionally, my hard disk is no longer stressed, because I am using shared memory (shm_key).
Kind regards
Steeg