Hello!
I am trying to use OpenLDAP to hold a small but continuously rebuilt database with an hdb backend. Basically I build a directory under a temporary node and move it into place when it's ready (hence the hdb: I want to move the whole tree into place in one go). I build a new directory, move the current one out to another temporary node and move the new one in. Lastly I delete the defunct tree and start over building a new tree. In short, the idea is to always have a complete tree in place while continuously trying to keep it updated (except, I suppose, between moving the old tree out and the new one in, but I don't see any solution for that).
This works well for a couple of hours on the machine I am running it on (PIII, 1 CPU, 1000 MHz, 512 MB RAM, Linux 2.6 kernel), but then it slows down by a factor of 10-20.
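The rebuild cycle described above can be sketched roughly as follows. This is only an in-memory illustration in Python, not the actual Perl/Net::LDAP script: the keys "live", "staging" and "retired" and the `build_tree` callback are hypothetical names, and each dictionary move stands in for a subtree rename in the directory.

```python
def rotate(directory, build_tree):
    """Run one rebuild cycle on a dict that models the directory.

    "live", "staging" and "retired" are hypothetical node names;
    build_tree is a hypothetical callback that returns the new tree.
    """
    # 1. Build the new tree under a temporary node.
    directory["staging"] = build_tree()
    # 2. Move the current tree out to another temporary node.
    directory["retired"] = directory.pop("live", None)
    # (Between step 2 and step 3 there is no complete tree in place.)
    # 3. Move the new tree into place in one go.
    directory["live"] = directory.pop("staging")
    # 4. Delete the defunct tree.
    del directory["retired"]


directory = {"live": {"version": 1}}
rotate(directory, lambda: {"version": 2})
print(directory)
```

The window without a live tree is limited to the gap between steps 2 and 3, which matches the exception noted above.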
Why is this and what can I do to stop it? (easy to ask...)
free shows:
# free
             total       used       free     shared    buffers     cached
Mem:        508104     501992       6112          0      31268     332556
-/+ buffers/cache:     138168     369936
Swap:      2104504       2904    2101600
This isn't brilliant of course but AFAIU not catastrophic either. I have about the same when it isn't slowed down. vmstat with a sampling interval of a few seconds shows no swapping before or after slapd slows down.
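As a sanity check on the free output, the "-/+ buffers/cache" line can be recomputed from the "Mem:" line. A minimal sketch in Python, with the numbers copied from the output above (in KiB); the function name is just an illustrative choice:

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' line of free from the 'Mem:' line."""
    used_by_apps = used - buffers - cached   # memory actually held by processes
    reclaimable = free + buffers + cached    # memory the kernel can hand back
    return used_by_apps, reclaimable


used_by_apps, reclaimable = buffers_cache_line(
    used=501992, free=6112, buffers=31268, cached=332556
)
print(used_by_apps, reclaimable)  # 138168 369936
```

So only about 135 MiB is held by processes, and most of the "used" memory is page cache and buffers.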
top shows that slapd and the script populating it each run at about 2-3% CPU, with not much CPU consumption apart from that (consistent with a system that slows down by a factor of 20, I guess). The script uses Net::LDAP in Perl (over a local socket), so no external clients are invoked.
The really puzzling bit is that if I shut down slapd and the "directory builder" (thus returning memory, file descriptors and such to the system) and then restart them, I almost immediately get the same slowdown. In fact, the time it takes for the slowdown to return after firing up slapd seems proportional to how long I let the machine "rest".
I have tried all sorts of things to analyze this and finally decided to profile slapd. I rebuilt it with -g -pg in CFLAGS and --enable-debug to configure (actually I have used that switch all along). I also discovered that I had to replace 'strip = -s' with 'strip =' in all makefiles even though --enable-debug was given (is this intentional or a bug in configure?). Finally I had to get the gprof-helper (and confirm that it was used) by Hocevar/Jönsson to be able to profile threaded applications. The result doesn't however tell me much. The slapd process seems to spend most (70-80%) of its time in the "at_next" routine.
Info about system: I am running v2.3.24 of slapd, built with:
./configure --program-prefix=jj4 --with-threads=yes --enable-dynamic \
    --enable-debug --enable-crypt --enable-lmpasswd --enable-spasswd \
    --enable-modules --enable-backends=mod --enable-sql=no \
    --enable-ldap=mod --enable-meta=mod --enable-monitor=mod \
    --enable-null=mod --enable-perl=no --enable-relay=mod \
    --enable-shell=mod --enable-overlays=mod --enable-denyop=mod \
    --enable-dyngroup=mod --enable-dynlist=mod --enable-lastmod=mod \
    --enable-proxycache=mod --enable-retcode=mod --enable-rwm=mod \
    --enable-dependency-tracking
(lots of modules are built but only the hdb backend is actually loaded when I'm running)
These are the relevant and nonsensitive parts of the slapd.conf:
---
sizelimit 1000000
moduleload back_hdb.la

database hdb
suffix *removed*
rootdn *removed*
rootpw *removed*
directory /usr/local/lis/var/db
checkpoint 512 5
dirtyread
dbconfig set_cachesize 0 16777216 8
dbconfig set_lg_regionmax 262144
dbconfig set_lg_bsize 2097152
dbconfig set_lg_max 16777216
dbconfig set_flags DB_LOG_AUTOREMOVE
index objectClass eq
---
I use dirtyread because I want to be able to read while I'm writing (which is almost always), and an occasional bad read is acceptable (it should be very rare anyway, since I don't make the modifications in the "current" tree that I read from).
Thanks in advance
Johan Jönemo