Howard Chu wrote:
Howard Chu wrote:
Well, it doesn't look like this patch caused any harm for the default case. I'm only seeing about a 10% gain in throughput using two listener threads on a 16-core machine. Not earth-shattering, but not bad.
Eh. That 10% was on a pretty lightly loaded test. Under heavy load the advantage is only 1.2%. Hardly seems worth the trouble.
And now the really worrying news: I was originally testing on an old install of Debian Lenny with a 2.6.26 kernel. I just now updated the system to Debian Squeeze running a 2.6.32 kernel, and the throughput results are a solid 20% slower with nothing else changed.
At lighter loads (16 slapd threads, 32 client threads) Lenny is up to 35% faster than Squeeze. At peak load (168 client threads) the difference is only 3%. It's probably network-limited by then.
I've re-run the same test sequence using tcmalloc but that didn't make much difference. Something else is much slower on this OS revision. I may build a 2.6.35 kernel just to see if it's a kernel or userspace problem...