Howard Chu wrote:
When looking for a performance bottleneck in a system, it always helps to search in the right component...
Tossing out the 4 old load generator machines and replacing them with two 8-core servers (and using slamd 2.0.1 instead of 2.0.0) paints quite a different picture.
http://highlandsun.com/hyc/slamd/squeeze/doublenew/jobs/optimizing_job_20100...
With the old client machines the latency went up to the 2-3msec range at peak load; with the new machines it stays under 0.9msec. So basically the slowdowns were due to the load generators getting overloaded, not to any part of slapd getting overloaded.
The shape of the graph still looks odd with this kernel. (The column for 3 threads per client is out of whack.) But the results are so consistent that I don't think measurement error is to blame.
Also added results using BDB 4.8.30 (the previous runs used 4.7.25), and results using a 2.6.35 kernel.
BDB 4.8 vs 4.7 seems to be worth about a 5% gain on its own. The 2.6.35 kernel gives a slight boost as well, with search rates spiking over 67600/second, and idle CPU down to 6%. At that point, slapd is consuming around 90% of the CPU. Network interrupts also consume about 6% total, or ~75% of one core.
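For anyone wanting to see the same CPU/interrupt breakdown, the standard tools suffice; a quick sketch (the exact column layout depends on your sysstat version):

    # per-CPU utilization incl. %irq/%soft, sampled every 5 seconds
    mpstat -P ALL 5
    # which cores are servicing the NIC's interrupts
    cat /proc/interrupts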
Hm, I may have forgotten to mention these tests are being run on an HP ProLiant DL585 G5 with 4 Opteron 8354 (2.2GHz quad core) processors. The box has 64GB of RAM; the DB has 5 million entries and slapd is using around 25-26GB of memory. (Around 4KB per entry, plus ~4GB of BDB shared memory cache.)
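For reference, the BDB shared memory cache is set in back-bdb's DB_CONFIG file in the database directory; a minimal sketch matching the ~4GB figure above (set_cachesize takes gbytes, bytes, and a segment count; the two-segment split is my assumption, since BDB wants caches this large broken into more than one segment on some platforms):

    # DB_CONFIG: ~4GB cache
    set_cachesize 4 0 2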
I re-ran the BDB 4.8 job with 29 iterations, to match a run I had done against some other directory server. (That other server took 29 iterations to satisfy the 2-consecutive-non-improving-iterations criterion; this was just to provide comparable data.) That's in the "double29" results. The only thing it really demonstrates is that OpenLDAP's performance is rock-steady under load: it doesn't just peak and then deteriorate as the load gets heavier. (Which we've seen on other servers as their thread queues get overwhelmed.)
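For those who haven't used SLAMD's optimizing jobs: the stop criterion is just a job parameter. Roughly, from memory (the exact parameter name in 2.0.1 may differ):

    Max Consecutive Non-Improving Iterations: 2
    # each iteration raises the threads-per-client count until the
    # search rate fails to improve for two iterations in a row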
Also for comparison I've done a run against OpenDS 2.3.0 build 3, using Sun JRE 1.6. I'm very impressed with OpenDS's results; I configured the JVM with a 32GB heap and it only used 17GB, but returned very good performance. Aside from allocating a few GB for the BDB cache it's basically in stock tune. (Access logging was also disabled for these runs.) I haven't yet tried with entry caching enabled; in previous runs entry caching gave me poor results. I guess I should ask on the OpenDS forums for further tuning advice.
http://highlandsun.com/hyc/slamd/squeeze/double29/
http://highlandsun.com/hyc/slamd/squeeze/opends2.3.0/
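For the record, the OpenDS tuning amounted to a few lines; these are from memory, and the exact property names may be off, so treat them as approximate:

    # config/java.properties (applied with bin/dsjavaproperties)
    start-ds.java-args=-server -Xms32g -Xmx32g
    # give the JE backend cache a few GB (10% of heap here is my guess)
    bin/dsconfig set-backend-prop --backend-name userRoot \
        --set db-cache-percent:10
    # disable access logging
    bin/dsconfig set-log-publisher-prop \
        --publisher-name "File-Based Access Logger" --set enabled:false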
There's a lot to be said for being able to achieve good performance without needing to fret over configuring individual caches. It makes a stronger case for back-mdb, to my mind.