On Tue, Nov 27, 2007 at 05:17:04AM -0800, Howard Chu wrote:
The -O2 build is faster from about 4 to 24 client threads. From 28 on,
the non-optimized code is faster at every load level. I was originally using
gcc 4.1.2 but I'm seeing the same result now using gcc 4.2.2. Also, slapd
is only configured with 8 worker threads in all of these tests. Strange
that whatever optimizations the compiler has generated speed things up under
lighter load, but work against it under heavier load.
1st-level instruction cache thrashing due to function