Howard Chu wrote:
> The time to slapadd 5 million entries for back-hdb was 29 minutes:
> time ../servers/slapd/slapd -Ta -f slapd.conf.5M -l ~hyc/5Mssha.ldif -q
>
> real 29m15.828s
> user 20m22.110s
> sys 6m56.433s
Using a shared memory BDB cache instead of mmap'd files brought this time down
to 16 minutes. (For 1M entries, it was only 3 minutes.)
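(For anyone who wants to try the same thing: the shared memory cache is just
the back-hdb shm_key directive plus a cache size in DB_CONFIG. A minimal
sketch, with a placeholder suffix, key, and size rather than my actual test
config:

    # slapd.conf, back-hdb database section
    database    hdb
    suffix      "dc=example,dc=com"
    directory   /var/openldap-data
    shm_key     42

    # DB_CONFIG in the database directory
    set_cachesize 12 0 1

Without shm_key, BDB backs its environment with mmap'd __db.* files in the
database directory; with it, the cache lives in a SysV shared memory segment.)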
I recently tested Isode's M-Vault 14. For the most part there were no
surprises: we were 5-6 times faster on the 1M entry DB. Load times were
comparable, with OpenLDAP at around 3 minutes (as before) and Isode at around
4 minutes. While OpenLDAP delivered over 29,000 auths/sec using 8 cores, Isode
delivered only 4,600 auths/sec. (And there's still the interesting result that
OpenLDAP delivered almost 31,000 auths/sec using only 7 cores, leaving one
core reserved for the Ethernet driver.)
But Steve Kille asked "how do they perform on a database much larger than the
server memory, like 50 million entries?" and that yielded an unexpected
result. For OpenLDAP 2.4.7, BDB 4.6.21, and back-hdb it took 11:27:40
(hh:mm:ss) to slapadd the database, while it took Isode only 3:47:33 to bulk
load. Naturally I was curious about why there was such a big difference.
Unfortunately the Isode numbers were not repeatable; I was using XFS and the
filesystem kept hanging on my subsequent rerun attempts.
I then recreated the filesystem using ext2 instead. For this load, Isode took
only 3:18:29 while OpenLDAP took 6:59:25. I was astonished at how much XFS was
costing us, but that still didn't explain the discrepancy. After all, this is
a DB that's 50x larger than the 1M entry case, yet the load time was roughly
220x longer.
Finally I noticed that there were long periods of time during the slapadd
when the CPU was 100% busy but no entries were being added, and traced it down
to BerkeleyDB's env_alloc_free function. So the issue Jong raised in ITS#3851
still hasn't been completely addressed in BDB 4.6. If you're working with data
sets much larger than your BDB cache, performance will plummet once the cache
fills and has to start evicting and replacing pages.
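(A quick way to watch this happening on your own environment is db_stat's
memory pool statistics, e.g.

    db_stat -m -h /var/openldap-data

where the path is just a placeholder for your database directory. Once the
cache fills, counters like "pages forced from the cache" start climbing while
the rate of added entries falls off.)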
I found the final proof of this conclusion by tweaking slapadd to use
DB_PRIVATE when creating the environment. This option just uses malloc for the
BDB caches, instead of shared memory. I also ran with tcmalloc. This time the
slapadd took only 3:04:21. It would probably be worthwhile to default to
DB_PRIVATE when using the -q option. Since slapadd is not extremely heavily
threaded, even the default system malloc() will probably work fine. (I'll
retest that as well.) One downside to this approach: when BDB creates its
shared memory caches, it allocates exactly the amount of memory you specify,
which is good. But when using malloc, BDB tracks how many bytes it has
requested but can't account for the overhead malloc itself needs to track
those allocations. As such, I had to decrease my 12GB BDB cache to only 10GB
in order for the slapadd to complete successfully (and it was using over 14GB
of RAM out of the 16GB on the box when it completed).
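(For the curious, the tweak amounts to one extra flag where the environment is
opened. A rough sketch of the relevant BDB calls follows; this is not the
actual back-hdb/slapadd code, and the helper name, path, and cache size are
made up for illustration:

    #include <db.h>

    static int
    open_private_env( const char *home, u_int32_t cache_gb )
    {
        DB_ENV *env;
        int rc;

        rc = db_env_create( &env, 0 );
        if ( rc ) return rc;

        /* cache_gb gigabytes in a single cache region */
        env->set_cachesize( env, cache_gb, 0, 1 );

        /* DB_PRIVATE: the cache is malloc'd inside this process
         * instead of living in a shared memory region or mmap'd
         * __db.* files, so only this process can see it */
        return env->open( env, home,
            DB_CREATE | DB_INIT_MPOOL | DB_PRIVATE, 0600 );
    }

Since DB_PRIVATE regions are private to the process, this is only suitable for
slapadd-style offline loads, not for a running slapd that other tools need to
attach to.)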
It would also be worthwhile to revisit Jong's patch in ITS#3851...
(The 50M entry DB occupied 69GB on disk. The last time I tested a DB of this
size was on an SGI Altix with 480GB of RAM and 32 CPUs available. Testing it
on a machine with only 16GB of RAM was not a lot of fun; it turns mainly into
a test of disk speed. OpenLDAP delivered only 160 auths/sec on XFS and 200
auths/sec on ext2. Isode delivered 8 auths/sec on ext2, and I never got a test
result for XFS.)
--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/