On Saturday, 22 November 2008 02:59:28, Howard Chu wrote:
Ralf Haferkamp wrote:
On Friday, 21 November 2008 14:21:06, Ralf Haferkamp wrote:
I did some profiling (with valgrind's callgrind tool) to find out where all the time is spent, and it revealed that 2.4 spends a significantly larger amount of system time in the pwrite() function than 2.3. Most of that seemed to come from bdb_tool_trickle_task(), which calls libdb's memp_trickle() function. Just to verify this, I ran a test build with the trickle task disabled, and slapadd's performance was back to a normal level, comparable to the 2.3.43 release.
Something else came to mind. The trickle task obviously cannot increase the overall volume of I/O, so there's no good reason for it to make things more than twice as slow. Except that if you're getting reads mixed with writes, you will lose a lot in disk seek time.
Since you said that your BDB cache size is large enough to contain the entire DB in each case, try this: preload the LDIF into the FS cache before running slapadd. (dd if=<ldif> of=/dev/null bs=1024k)
That will eliminate any seek overhead during the run.
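For anyone without dd at hand, the same warm-up can be sketched in a few lines of Python. This is only an illustration of the idea, not something from the OpenLDAP tree: the function name preload_into_fs_cache and the 1 MiB default block size (mirroring bs=1024k) are assumptions made for the example.

```python
def preload_into_fs_cache(path, block_size=1024 * 1024):
    """Read `path` sequentially and throw the data away.

    Hypothetical helper: like `dd if=<ldif> of=/dev/null bs=1024k`,
    its only purpose is to make the kernel pull the file into the
    FS cache. Returns the number of bytes read.
    """
    total = 0
    # buffering=0 gives a raw file object, so each read() maps to
    # a single read of at most block_size bytes.
    with open(path, "rb", buffering=0) as ldif:
        while True:
            chunk = ldif.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total
```

Calling preload_into_fs_cache() on the LDIF right before slapadd should have the same effect as the dd command, provided the file fits into free memory.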
That didn't have any effect. But since I ran the tests multiple times, I guess the LDIF was already in the FS cache anyway.
But since OpenLDAP 2.4 linked against db-4.7.25 is almost as fast as OpenLDAP 2.3 (as I wrote in my second post on Friday), I think this is a problem in db-4.5.20, and I wonder whether we should just disable the trickle task when linking against a 4.5.x libdb.
When linking against db-4.7.25, slapadd is still a little faster when the trickle task is disabled: 15m19s (without the trickle task) vs. 16m12s (with it) for the 500k-entry LDIF.
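To put the two timings above in relation, here is a quick calculation in Python. The helper to_seconds and the variable names are made up for this illustration; only the two durations come from the measurements.

```python
import re

def to_seconds(duration):
    """Convert a string like '15m19s' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)m(\d+)s", duration)
    if match is None:
        raise ValueError("expected <minutes>m<seconds>s, got %r" % duration)
    return int(match.group(1)) * 60 + int(match.group(2))

without_trickle = to_seconds("15m19s")  # 919 seconds
with_trickle = to_seconds("16m12s")     # 972 seconds

extra = with_trickle - without_trickle
# Overhead relative to the run without the trickle task.
overhead_pct = 100.0 * extra / without_trickle

print(f"{extra} s slower, {overhead_pct:.1f}% overhead")
# prints "53 s slower, 5.8% overhead"
```

So with db-4.7.25 the trickle task costs a bit under 6% on this data set.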
But if the task really helps to speed up slapadd in very large environments, that difference might be acceptable.