Howard Chu wrote:
The time to slapadd 5 million entries for back-hdb was 29 minutes:

    time ../servers/slapd/slapd -Ta -f slapd.conf.5M -l ~hyc/5Mssha.ldif -q

    real    29m15.828s
    user    20m22.110s
    sys     6m56.433s
Using a shared memory BDB cache instead of mmap'd files brought this time down to 16 minutes. (For 1M entries, it was only 3 minutes.)
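For reference, the shared-memory cache is just a different way of creating the BDB environment regions. Below is a minimal sketch of the difference using the Berkeley DB 4.x C API; the helper name, home path, segment key, and exact flag set are mine for illustration (in practice back-hdb selects this via its shm_key directive in slapd.conf rather than code like this):

    #include <db.h>

    /* Open an environment whose cache lives in SysV shared memory
     * (DB_SYSTEM_MEM) instead of the default mmap'd __db.* region files. */
    int open_shm_env(DB_ENV **envp, const char *home)
    {
        DB_ENV *env;
        int rc;

        if ((rc = db_env_create(&env, 0)) != 0)
            return rc;

        /* (gbytes, bytes, ncache): a 12GB cache in a single region */
        env->set_cachesize(env, 12, 0, 1);

        /* DB_SYSTEM_MEM requires a base segment ID chosen by the caller */
        env->set_shm_key(env, 42);

        rc = env->open(env, home,
            DB_CREATE | DB_INIT_MPOOL | DB_INIT_LOCK |
            DB_INIT_LOG | DB_INIT_TXN | DB_SYSTEM_MEM, 0600);
        if (rc != 0) {
            env->close(env, 0);
            return rc;
        }
        *envp = env;
        return 0;
    }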
I recently tested Isode's M-Vault 14. For the most part there were no surprises: we were 5-6 times faster on the 1M entry DB. Load times were comparable, with OpenLDAP at around 3 minutes (as before) and Isode at around 4 minutes. While OpenLDAP delivered over 29,000 auths/sec using 8 cores, Isode delivered only 4,600 auths/sec. (And there's still the interesting result that OpenLDAP delivered almost 31,000/sec using only 7 cores, leaving one core reserved for the ethernet driver.)
But Steve Kille asked "how do they perform on a database much larger than the server memory, like 50 million entries?" and that yielded an unexpected result. With OpenLDAP 2.4.7, BDB 4.6.21, and back-hdb it took 11:27:40 (hh:mm:ss) to slapadd the database, while Isode's bulk load took only 3:47:33. Naturally I was curious about why there was such a big difference. Unfortunately the Isode numbers were not repeatable; I was using XFS, and the filesystem kept hanging on my subsequent rerun attempts.
I then recreated the filesystem using EXT2fs instead. For this load, Isode took only 3:18:29 while OpenLDAP took 6:59:25. I was astonished at how much XFS was costing us, but that still didn't explain the discrepancy: this DB is only 50x larger than the 1M entry case, yet it took 220x longer to load.
Finally I noticed that there were long stretches during the slapadd when the CPU was 100% busy but no entries were being added, and traced this down to BerkeleyDB's env_alloc_free function. So the issue Jong raised in ITS#3851 still hasn't been completely addressed in BDB 4.6: if you're working with data sets much larger than your BDB cache, performance will plummet once the cache fills and has to start dumping and replacing pages.
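(For anyone who wants to watch this happen, the same counters are visible from db_stat -m, or from the mpool statistics API. A rough sketch is below; the function name is mine and I'm only printing the counters that matter here. Once the working set outgrows the cache, the clean/dirty eviction counters climb while the hit rate collapses.)

    #include <db.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Print the memory pool counters that show cache thrashing. */
    void report_mpool(DB_ENV *env)
    {
        DB_MPOOL_STAT *sp;

        if (env->memp_stat(env, &sp, NULL, 0) != 0)
            return;

        printf("hits %lu misses %lu clean-evict %lu dirty-evict %lu\n",
            (unsigned long)sp->st_cache_hit,
            (unsigned long)sp->st_cache_miss,
            (unsigned long)sp->st_ro_evict,
            (unsigned long)sp->st_rw_evict);

        free(sp);   /* stats struct is allocated by the library */
    }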
I found the final proof of this conclusion by tweaking slapadd to use DB_PRIVATE when creating the environment. This option just uses malloc for the BDB caches, instead of shared memory. I also ran with tcmalloc. This time the slapadd took only 3:04:21. It would probably be worthwhile to default to DB_PRIVATE when using the -q option. Since slapadd is not all that heavily threaded, even the default system malloc() will probably work fine. (I'll retest that as well.)

One downside to this approach: when BDB creates its shared memory caches, it allocates exactly the amount of memory you specify, which is good. When it uses malloc, it only tracks how many bytes it requested and can't account for the overhead malloc itself needs to manage those allocations. As such, I had to decrease my 12GB BDB cache to only 10GB in order for the slapadd to complete successfully (and it was using over 14GB of the 16GB of RAM on the box when it completed).
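To make the DB_PRIVATE change concrete, here's a rough sketch of the environment open, again using the raw BDB C API; the helper name, flag set, and sizes are illustrative only, not the actual slapadd code, and the real change would simply add DB_PRIVATE to the flags slapadd already uses when -q is given:

    #include <db.h>

    /* Open the environment with its cache on the process heap (DB_PRIVATE)
     * instead of shared regions.  Only valid for a single process, which is
     * exactly the slapadd -q case. */
    int open_private_env(DB_ENV **envp, const char *home)
    {
        DB_ENV *env;
        int rc;

        if ((rc = db_env_create(&env, 0)) != 0)
            return rc;

        /* Leave headroom for malloc's own bookkeeping: BDB only counts
         * the bytes it asks for, so a "10GB" heap cache costs noticeably
         * more RAM in practice. */
        env->set_cachesize(env, 10, 0, 1);

        rc = env->open(env, home,
            DB_CREATE | DB_INIT_MPOOL | DB_PRIVATE, 0600);
        if (rc != 0) {
            env->close(env, 0);
            return rc;
        }
        *envp = env;
        return 0;
    }

(For the tcmalloc run, no code change is needed; preloading the library via LD_PRELOAD is enough.)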
It would also be worthwhile to revisit Jong's patch in ITS#3851...
(The 50M entry DB occupied 69GB on disk. The last time I tested a DB of this size was on an SGI Altix with 480GB of RAM and 32 CPUs available. Testing it on a machine with only 16GB of RAM was not a lot of fun; it turns mainly into a test of disk speed. OpenLDAP delivered only 160 auths/second on XFS and 200 auths/second on EXT2FS. Isode delivered 8 auths/sec on EXT2FS, and I never got a test result for XFS.)