Howard and all,
I ran more tests and it looks like the problem persists. I saw some changes, but only in the memory consumption of the consumer (slave) syncrepl.
Let me try to explain better. I have a pair of provider/consumer machines where one machine always receives all reads/writes and the other is just for High Availability (HA) purposes, so it is best to keep the two DBs as close to each other as possible.
I start the provider (master) and then, just after, start the consumer (slave). The configuration doesn't appear to have problems: I have 2 DBs in my configuration, CONTENT and INDEX, and I see the consumer doing 2 searches in these DBs when it starts (this is ok).
After this, CPU usage on both consumer and provider increases, as does the memory allocated by the slapd process. With the HEAD changes, memory consumption in the consumer grows at a much faster rate, something like 10:1. So to reproduce the issue I needed to reduce the dncachesize directive on the consumer to 1/10 of the provider value, from 4,000,000 to 400,000. This keeps the process from consuming all memory before the issue arises.
Let me try to summarize:
1) Start the provider (master) slapd process;
2) Start the consumer (slave) slapd process;
3) Monitor memory and CPU usage on both provider and consumer;
4) Occasionally run a monitor check to see the cache information;
5) Before the cache is full on the provider (master), collect a gdb debug of the consumer (slave) process threads;
6) Wait until the consumer (slave) process starts to use around 200% CPU and then collect a gdb debug again;
7) Wait a little longer until the provider (master) CPU usage drops to 0% and see that the consumer (slave) CPU stays stable at 200%. Collect a gdb debug;
8) Wait some more time and collect one more gdb debug to see if something changed.
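For anyone who wants to automate the "CPU stays at 200%" check in steps 6) and 7), here is a minimal Python sketch. It is not what I actually ran; the function names, the 190% threshold and the sample layout are assumptions for illustration. Each sample is a (wall-clock seconds, cumulative CPU seconds) pair for the slapd process, which on Linux could be taken from /proc/<pid>/stat or from ps, for example.

```python
def cpu_percent(prev, curr):
    """CPU utilisation between two (wall_seconds, cpu_seconds) samples.

    200.0 means the process kept two cores busy over the interval.
    """
    d_wall = curr[0] - prev[0]
    if d_wall <= 0:
        raise ValueError("samples must be in increasing wall-clock order")
    return 100.0 * (curr[1] - prev[1]) / d_wall


def looks_pegged(samples, threshold=190.0, min_intervals=3):
    """True if the last `min_intervals` intervals all ran at or above threshold.

    `threshold` and `min_intervals` are arbitrary illustration values.
    """
    rates = [cpu_percent(a, b) for a, b in zip(samples, samples[1:])]
    recent = rates[-min_intervals:]
    return len(recent) == min_intervals and all(r >= threshold for r in recent)
```

Working from deltas of cumulative CPU time avoids relying on a lifetime-average CPU figure, which can hide a process that only recently became busy.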
I recompiled HEAD with GDB symbols for debugging. With that build I created the attached file, in which I collected the debug information from the consumer slapd (including the syncrepl thread) more than once. Please see the attached file for details.
Item 7) is where I think the issue is happening. The synchronization never ends, the consumer (slave) responds to queries very slowly, CPU usage stays fixed at 200%, and the logic never appears to work as expected, or in the end never synchronizes.
In the end, it appears that syncrepl still has some issue synchronizing the DBs.
--- On Thu, 3/19/09, Howard Chu <hyc(a)symas.com> wrote:
> From: Howard Chu <hyc(a)symas.com>
> Subject: Re: slapd syncrepl consumer having permanent high CPU load
> To: rlvcosta(a)yahoo.com
> Cc: openldap-software(a)openldap.org, "John Morrissey" <jwm(a)horde.net>
> Date: Thursday, March 19, 2009, 2:04 PM
> Rodrigo Costa wrote:
> > Folks,
> > I was preparing openLDAP with GDB symbols but looks like the issue was
> > identified and solved in HEAD. Just to identify this issue; was created any
> > sort of ITS for verification in a new load?
> No, the further work was just associated with ITS#5860.
> > Sorry my late response but my baby daughter just born last week and I was
> > having some work at home.
> > I will give a try in the HEAD load.
> Try RE24 now, that's the current release candidate.
> > Best Regards,
> > Rodrigo.
> > PS-> Just some link from my daughter
> > http://sites.google.com/site/lauramenina/laura_english
> > --- On Wed, 3/18/09, Howard Chu<hyc(a)symas.com>
> >> From: Howard Chu<hyc(a)symas.com>
> >> Subject: Re: slapd syncrepl consumer having permanent high CPU load
> >> To: "John Morrissey"<jwm(a)horde.net>
> >> Cc: openldap-software(a)openldap.org
> >> Date: Wednesday, March 18, 2009, 5:21 AM
> >> John Morrissey wrote:
> >>> After ~16h uptime, slapd with this BDB had its DN cache to ~250k
> >>> entries after it previously appeared stable at configured 20k entries,
> >>> and its entry cache had ballooned to ~480k. Its RSS was about 3.6GB
> >>> at this point, with a BDB cache size of 2GB.
> >> I was finally able to reproduce this (took several hours of
> >> searches. Fortunately I was at a St. Pat's party so I didn't
> >> have to wait around, just got home in time to see it start
> >> going bad...). A fix is now in HEAD.
> >> (And now we'll see if Guinness is Good For Your Code... ;)
> >> -- -- Howard Chu
> >> CTO, Symas Corp.
> >> http://www.symas.com
> >> Director, Highland Sun
> >> Chief Architect, OpenLDAP http://www.openldap.org/project/
> -- Howard Chu
> CTO, Symas Corp.
> Director, Highland Sun
> Chief Architect, OpenLDAP http://www.openldap.org/project/
I've got slapo-dynlist working :
dn: cn=Dynamic Group,ou=Groups,dc=example,dc=com
cn: Dynamic Group
Although, I'm curious if anyone is using this to create dynamic groups
to authenticate users with apache, etc. We can talk off list if
desired, so it doesn't get extremely off topic.
I'm using openldap 2.4.15 on FreeBSD 6.4
ldapsearch with the -W switch gives me the expected output but ends with
ldapsearch in free(): error: junk pointer, too high to make sense
Abort trap: 6 (core dumped)
ldapsearch with the -w switch does not core dump
Has anyone been using slapo-autogroup?
I tried this in slapd.conf:
autogroup-attrset groupOfURLs memberURL memberUid
As soon as autogroup is configured, any search will return:
result: 53 Server is unwilling to perform
text: operation not supported within namingContext
A quick look at the code shows that the search method is not
implemented, but the same thing does not prevent other overlays from
Did I misconfigure it, or is it broken? I would like to be sure I did
not misunderstand how it works before I start hacking the code.
I am working on setting up OpenLDAP for a web project. I had originally
planned on using the accesslog overlay to track all access to the LDAP. I
have discovered that the use of this overlay has a tremendous impact on
I started out by auditing all operations. With this configuration I was
getting 40-50 reads/sec max. I then decided that I only needed to audit
writes. Making this change sent the reads through the roof, but the writes
were still only averaging 16-20 per second. I am in the process of retesting
the writes with no accesslog. My assumption is that the writes per second
will jump dramatically.
So I have to ask: what is the purpose of the accesslog overlay? Is it
really needed, and if so, is there a way to increase its performance? I have
to admit that I have not paid attention to the settings for the accesslog
backend. Do I need to tweak these settings just like I did for my primary
backend? If so what are the optimal settings for a write intensive database?
I am running OpenLDAP 2.3.39 (locally built) on Red Hat Enterprise Linux
4 servers with several replicas. We use delta-syncrepl to keep the
replicas in sync with the master server.
We also use nagios and monitor the contextcsn value on the replica and
alert if it gets too far out of sync with the master server.
The issue we have now experienced a few times is that if there are a LOT
of updates in the nightly batch update process, not all of the
updates make it to the replicas but the contextcsn stays in sync, so we
see strange errors that eventually lead us to see that the replicas are
not current even though they think they are.
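For reference, the kind of contextCSN check described above can be sketched in Python roughly as follows. This is only an illustration, not our actual nagios plugin: the function names are made up, and the CSN strings are assumed to have already been fetched from each server's suffix entry. It only compares the timestamp part of the CSN, so, as this report shows, a lag of zero does not by itself prove that the replica holds every entry.

```python
from datetime import datetime, timezone


def csn_time(csn):
    """Return the UTC timestamp encoded at the start of a CSN string.

    Accepts stamps with or without fractional seconds, e.g.
    20090319140400Z#... or 20090319140400.000000Z#...
    """
    stamp = csn.split("#", 1)[0]
    if not stamp.endswith("Z"):
        raise ValueError("unexpected CSN timestamp: %r" % stamp)
    stamp = stamp[:-1]
    fmt = "%Y%m%d%H%M%S.%f" if "." in stamp else "%Y%m%d%H%M%S"
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def csn_lag_seconds(provider_csn, consumer_csn):
    """Seconds by which the consumer's contextCSN trails the provider's."""
    return (csn_time(provider_csn) - csn_time(consumer_csn)).total_seconds()
```

A monitoring script would alert when the lag exceeds some site-specific limit; catching the case above would need an additional content comparison between master and replica.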
Is this a known issue? I haven't found a syslog entry on the server or
the replicas that looks like an indication of the root cause.
I have downloaded and built the 2.3.43 release, having installed it on
one replica. That replica is just as out of date this morning as the
others -- so, if there was a solution between 2.3.39 and 2.3.43 -- it
must have been on the provider side not the consumer side.
Thanks for any insight.
Frank Swasey | http://www.uvm.edu/~fcs
Sr Systems Administrator | Always remember: You are UNIQUE,
University of Vermont | just like everyone else.
"I am not young enough to know everything." - Oscar Wilde (1854-1900)
We are using Symas OpenLDAP 188.8.131.52 on a Red Hat EL 5.2 64-bit with 8GB of
My DB_CONFIG settings are:
set_cachesize 3 0 2
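To spell out what that line asks BDB for: the three arguments of set_cachesize are gigabytes, additional bytes, and the number of cache segments. A small Python sketch of the arithmetic (the helper name is just for illustration, it is not part of any OpenLDAP or BDB tool):

```python
def bdb_cachesize(line):
    """Return (total_bytes, ncache) for a DB_CONFIG set_cachesize line."""
    parts = line.split()
    if len(parts) != 4 or parts[0] != "set_cachesize":
        raise ValueError("not a set_cachesize line: %r" % line)
    gbytes, extra_bytes, ncache = (int(p) for p in parts[1:])
    # Total cache = whole gigabytes plus the extra byte count.
    return gbytes * 1024 ** 3 + extra_bytes, ncache
```

For the setting above this gives 3221225472 bytes (3 GB) split across 2 segments.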
The 10 million user load finished in about 18 hours. I believe the primary
issue is the fact that we have a single disk on the system. I assume that if
I were to move the database and log files off to separate disks I would get
much better performance.
On Wed, Mar 18, 2009 at 11:26 PM, Quanah Gibson-Mount <quanah(a)zimbra.com> wrote:
> --On Wednesday, March 18, 2009 1:25 PM -0400 Pete Giesin
> <pgiesin(a)hubcitymedia.com> wrote:
>> I am trying to perform some benchmarking against OpenLDAP. So far I have
>> run my tests against a 100K and 1Million entry database and I have had
>> rather decent numbers. My final set of tests were to be run against a
>> 10Million entry database. Unfortunately, I am having difficulty loading
>> the database with this many entries. I have generated 10 1Million LDIF
>> files. I am using "slapadd -c -q -v -l <file>" to import each file. The
>> first 2 files took approximately 15 minutes each to load. The remaining 8
>> files are taking progressively longer and longer. So much longer that I
>> anticipate the entire process to take well over 24 hours. My question is:
>> is there anything I can do to increase the performance of slapadd? I
>> assume that since slapd is not running at this point that the normal
>> DB_CONFIG and slapd.conf settings do not have much effect.
> What are the settings in your DB_CONFIG file? It is absolutely critical to
> the performance of slapadd. What version of OpenLDAP are you using? What
> version of BDB? What operating system?
> Quanah Gibson-Mount
> Principal Software Engineer
> Zimbra, Inc
> Zimbra :: the leader in open source messaging and collaboration
I'm using openldap 2.4.11 on linux x86 system, with
one database bdb backend, with the following options
in slapd.conf file:
checkpoint 0 0
Is it possible to disable bdb log.00000[0..n] files
completely? I can slapcat the database, then
remove these, slapadd and reindex database, and there's
no logs, but is it possible to avoid creating them?
Or set up some limit on the number/size of them? Let's assume
I don't need logs. I hate logs. I don't want any logs, whatever :)
I reviewed bdb documentation, but there's no clear information
how to control this.
Additionally - when I have a "DB_CONFIG" file in /etc/ldap, and
a DB_CONFIG file in database storage dir, e.g. /var/lib/ldap, which
one is actually used? My guess'd be /var/lib/ldap/DB_CONFIG, but
I'd like to know for sure :-)
In the ldap.conf man page I can read:
never The client will not request or check any server certificate.
In this case, will the ldaps:// connection still be encrypted anyway?
I have a working 2.3 openldap server with radius.schema, modified
How can I move all my schema to the 2.4 version?
Can somebody give me a manual or tell me how I can do it, e.g. convert and
include or something else?
I ask because I have schema problems on 2.4.