I've been doing some testing using OpenLDAP with BDB on a couple of different platforms, and I noticed the same behavior on each. When I sit in a loop doing adds, at the 65,536th added entry the process stalls for a short period of time. After a minute or two, the add succeeds. My first thought was that this is a BDB issue, so I posted the question to Oracle's BDB forum, but I have yet to receive an answer.
This situation seems to happen when I have around 43 10MB log files. During the stall, I notice many log files being written (another 25 or so), at a much quicker rate than before the stall.
The stall only happens once. I added another 350,000 entries and saw no more stalls. I ran a few other tests: I added 65,535 entries and all was fine, but as soon as the next entry was added, even if I recycled the server, I hit the condition. I even tried deleting 1,000 entries; I then needed to add 1,001 to get to 65,536 entries in the database before hitting the delay.
I did try playing around with the number of indexes, and it did seem to affect the length of the delay, but not whether the delay occurs.
I'm trying to understand what OpenLDAP or BDB is doing during the stall. Is there a reorganization of tables/indexes based on a threshold of 65,536 entries? Is this a one-time-only event, as my testing seems to show? Again, my suspicion is that it's more of a BDB issue, but I thought others here may have seen this situation.
Some values from my DB_CONFIG file:

set_cachesize 0 20971520 1
set_lg_regionmax 1048576
set_lg_max 10485760
set_lg_bsize 2097152
set_lk_max_locks 2000
set_lk_max_objects 2000
set_open_flags db_private
Some values from my slapd.conf:

database bdb
suffix "dc=myco,dc=com"
rootdn "cn=Manager,dc=myco,dc=com"
rootpw secret
directory /usr/local/var/openldap-data
index objectClass eq
index cn eq,sub
index departmentNumber eq
index employeeNumber eq,sub
index uid eq,sub
index entryCSN eq
index entryUUID eq
cachesize 5000
idlcachesize 5000
dncachesize 30000
cachefree 100
searchstack 8
threads 4
Thanks for any help,
Mark
--On Tuesday, July 30, 2013 4:11 PM -0400 Mark Cooper markcoop@us.ibm.com wrote:
I've been doing some testing using OpenLDAP with BDB on a couple of different platforms, and I noticed the same behavior on each. When I sit in a loop doing adds, at the 65,536th added entry the process stalls for a short period of time. After a minute or two, the add succeeds. My first thought was that this is a BDB issue, so I posted the question to Oracle's BDB forum, but I have yet to receive an answer.
Expected. You do know the significance of 65535 right? ;)
http://en.wikipedia.org/wiki/65535_%28number%29
This is indeed a BDB bit.
--Quanah
--
Quanah Gibson-Mount
Lead Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
--On Tuesday, July 30, 2013 1:30 PM -0700 Quanah Gibson-Mount quanah@zimbra.com wrote:
--On Tuesday, July 30, 2013 4:11 PM -0400 Mark Cooper markcoop@us.ibm.com wrote:
I've been doing some testing using OpenLDAP with BDB on a couple of different platforms, and I noticed the same behavior on each. When I sit in a loop doing adds, at the 65,536th added entry the process stalls for a short period of time. After a minute or two, the add succeeds. My first thought was that this is a BDB issue, so I posted the question to Oracle's BDB forum, but I have yet to receive an answer.
Expected. You do know the significance of 65535 right? ;)
http://en.wikipedia.org/wiki/65535_%28number%29
This is indeed a BDB bit.
Well, actually, OpenLDAP bit. That's where an IDL hits maxsize and collapses into a range.
--Quanah
--
Quanah Gibson-Mount
Lead Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
On 07/30/2013 01:30 PM, Quanah Gibson-Mount wrote:
--On Tuesday, July 30, 2013 4:11 PM -0400 Mark Cooper markcoop@us.ibm.com wrote:
I've been doing some testing using OpenLDAP with BDB on a couple of different platforms, and I noticed the same behavior on each. When I sit in a loop doing adds, at the 65,536th added entry the process stalls for a short period of time. After a minute or two, the add succeeds. My first thought was that this is a BDB issue, so I posted the question to Oracle's BDB forum, but I have yet to receive an answer.
Expected. You do know the significance of 65535 right? ;)
http://en.wikipedia.org/wiki/65535_%28number%29
This is indeed a BDB bit.
65536 (1 << 16) is the threshold at which index lists are turned into ranges; perhaps this has to do with what you see.
p.
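To put a number on the threshold Pierangelo mentions, here is a minimal sketch in C; IDL_DB_MAX is an assumed name for illustration only, not the identifier used inside back-bdb:

/* Minimal sketch of the 1 << 16 threshold mentioned above.
 * IDL_DB_MAX is an assumed name for illustration; it is not the
 * identifier used inside back-bdb. */
#include <stdio.h>

#define IDL_DB_MAX (1 << 16)     /* 65536 IDs under one index key */

int main(void)
{
    /* The delete/re-add test from the original post: 65,535 entries are
     * fine, deleting 1,000 and re-adding 1,001 reaches 65,536 again. */
    unsigned long entries = 65535 - 1000 + 1001;

    printf("%lu entries: %s\n", entries,
           entries >= IDL_DB_MAX ? "at the range-collapse threshold"
                                 : "below the threshold");
    return 0;
}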
Mark Cooper wrote:
I've been doing some testing using OpenLDAP with BDB on a couple of different platforms, and I noticed the same behavior on each. When I sit in a loop doing adds, at the 65,536th added entry the process stalls for a short period of time. After a minute or two, the add succeeds. My first thought was that this is a BDB issue, so I posted the question to Oracle's BDB forum, but I have yet to receive an answer.
This is all known/expected behavior. One (or more) of your index slots hit its maxsize of 65535 elements and was collapsed into a range. This typically happens with the objectClass index first, if you're adding a bunch of objects all of the same classes.
Taking a minute or two is abnormal, but I suppose is possible if multiple indices hit the condition at the same time.
This situation seems to happen when I have around 43 10MB log files. During the stall, I notice many log files being written (another 25 or so), at a much quicker rate than before the stall.
The stall only happens once. I added another 350,000 entries and saw no more stalls. I ran a few other tests: I added 65,535 entries and all was fine, but as soon as the next entry was added, even if I recycled the server, I hit the condition. I even tried deleting 1,000 entries; I then needed to add 1,001 to get to 65,536 entries in the database before hitting the delay.
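To make the collapse Howard describes concrete, here is a minimal sketch of one index slot turning into a range once its ID list reaches the limit. The type, constant, and function names are assumptions for illustration only and do not match the actual back-bdb IDL code; the one-time rewrite of the on-disk index under a transaction is presumably what produces the burst of BDB log files seen during the stall.

/* Sketch: an index "slot" (the ID list stored under one index key)
 * collapsing into a range once it reaches its maximum size.
 * Names, layout, and limit handling are assumptions for illustration;
 * the real back-bdb IDL code differs. */
#include <stdlib.h>

#define IDL_MAX 65535            /* assumed per-key list limit */

typedef unsigned long ID;

typedef struct idl {
    int    is_range;             /* 0: explicit list, 1: [lo, hi] range */
    size_t count;                /* number of IDs while still a list */
    ID     lo, hi;               /* bounds once collapsed */
    ID    *ids;                  /* explicit, ascending list of entry IDs */
} idl;

/* Record one more entry ID under this index key. */
static void idl_insert(idl *l, ID id)
{
    if (l->is_range) {           /* already a range: just widen it */
        if (id < l->lo) l->lo = id;
        if (id > l->hi) l->hi = id;
        return;
    }
    if (l->count == IDL_MAX) {   /* limit hit: collapse the list */
        l->is_range = 1;         /* this one-time rewrite is the stall */
        l->lo = l->ids[0];
        l->hi = id > l->ids[l->count - 1] ? id : l->ids[l->count - 1];
        free(l->ids);
        l->ids = NULL;
        return;
    }
    l->ids[l->count++] = id;     /* normal case: append to the list */
}

int main(void)
{
    idl oc = { 0, 0, 0, 0, malloc(IDL_MAX * sizeof(ID)) };

    /* Adding entries that all share the same objectClass: the 65,536th
     * add collapses that objectClass index slot into a range. */
    for (ID id = 1; id <= 70000; id++)
        idl_insert(&oc, id);

    /* oc.is_range is now 1 and covers [1, 70000]; further adds merely
     * extend hi, so the collapse does not repeat. */
    free(oc.ids);                /* NULL after the collapse: no-op */
    return 0;
}

In this sketch, later adds merely widen the range, which lines up with the observation that the stall happens only once.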
I'm not sure exactly how I am supposed to respond (reply-all, or just to openldap-technical@openldap.org), so excuse me if a reply-all was the wrong choice.
I appreciate all the quick responses. I've been working on this issue for a couple of weeks, so it's great to be able to post a question and get back the answer so rapidly.
I do have some follow-up questions based on the responses:

1) Quanah wrote: "This is indeed a BDB bit." I'm not sure I understand what that means. Is part of the delay due to BDB? It does seem like a whole bunch of log files (maybe 30) get written out before processing resumes.

2) Index lists get collapsed into ranges - I assume this is done to make processing more efficient. Is this a one-time event? As I mentioned, I did not see it happen again.

3) If I know I'm going to have more than 64K elements, is there any way to force the use of ranges at server startup and avoid the delay? I didn't see any such option.
Quanah - Yes, I understood that 65,536 is a power of 2 and its importance in computer processing :)
Thanks again for the answers
From: Howard Chu hyc@symas.com
To: Mark Cooper/Poughkeepsie/IBM@IBMUS, openldap-technical@openldap.org
Date: 07/30/2013 05:11 PM
Subject: Re: OpenLDAP (using BDB) stalls adding 65,536th entry
Sent by: openldap-technical-bounces@openldap.org
Mark Cooper wrote:
I've been doing some testing using OpenLDAP with BDB on a couple of different platforms, and I noticed the same behavior on each. When I sit in a loop doing adds, at the 65,536th added entry the process stalls for a short period of time. After a minute or two, the add succeeds. My first thought was that this is a BDB issue, so I posted the question to Oracle's BDB forum, but I have yet to receive an answer.
This is all known/expected behavior. One (or more) of your index slots hit its maxsize of 65535 elements and was collapsed into a range. This typically happens with the objectClass index first, if you're adding a bunch of objects all of the same classes.
Taking a minute or two is abnormal, but I suppose is possible if multiple indices hit the condition at the same time.
This situation seems to happen when I have around 43 10MB log files. During the stall, I notice many log files being written (another 25 or so), at a much quicker rate than before the stall.
The stall only happens once. I added another 350,000 entries and saw no more stalls. I ran a few other tests: I added 65,535 entries and all was fine, but as soon as the next entry was added, even if I recycled the server, I hit the condition. I even tried deleting 1,000 entries; I then needed to add 1,001 to get to 65,536 entries in the database before hitting the delay.
--
Howard Chu
CTO, Symas Corp.           http://www.symas.com
Director, Highland Sun     http://highlandsun.com/hyc/
Chief Architect, OpenLDAP  http://www.openldap.org/project/
--On Tuesday, July 30, 2013 9:58 PM -0400 Mark Cooper markcoop@us.ibm.com wrote:
I'm not sure exactly how I am supposed to respond (reply-all, or just to openldap-technical@openldap.org), so excuse me if a reply-all was the wrong choice.
I appreciate all the quick responses. I've been working on this issue for a couple of weeks, so it's great to be able to post a question and get back the answer so rapidly.
I do have some follow-up questions based on the responses:

1) Quanah wrote: "This is indeed a BDB bit." I'm not sure I understand what that means. Is part of the delay due to BDB? It does seem like a whole bunch of log files (maybe 30) get written out before processing resumes.
Actually, as I noted in my follow-up, it is the OpenLDAP IDL range compression, not BDB. ;)
--Quanah
--
Quanah Gibson-Mount
Lead Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration