OL2.3 vs OL2.4 perf issues

Quanah Gibson-Mount

28 Jul 2009

10:36 a.m.

I've been spending some time perf testing OL 2.4 in relation to OL 2.3. Unfortunately, RE24 is noticeably slower than 2.3 was. Results of simple auth testing with slamd show:

OL 2.3: 21,745 auths/second

OL 2.4: 15,733 auths/second

So OL 2.4 is 6,000 auths/second (aka 12,000 searches/second) slower than 2.3. I.e., 27% slower.

Howard committed a patch that slightly helps some situations, and Hallvard has a rewrite of part of the lber library that I've been testing that he'll commit soon. That helps somewhat:

OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

--Quanah

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration

Emmanuel Lecharny

28 Jul

10:44 a.m.

Quanah Gibson-Mount wrote:

...

I've been spending some time perf testing OL 2.4 in relation to OL 2.3. Unfortunately, RE24 is noticeably slower than 2.3 was. Results of simple auth testing with slamd show:

OL 2.3: 21,745 auths/second

OL 2.4: 15,733 auths/second

So OL 2.4 is 6,000 auths/second (aka 12,000 searches/second) slower than 2.3. I.e., 27% slower.

Howard committed a patch that slightly helps some situations, and Hallvard has a rewrite of part of the lber library that I've been testing that he'll commit soon. That helps somewhat:

OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Any info about the infra you are running this test ?

Thanks !

...

--Quanah

--

Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc

Zimbra :: the leader in open source messaging and collaboration

-- cordialement, regards, Emmanuel Lécharny www.iktek.com directory.apache.org

Quanah Gibson-Mount

11:21 a.m.

--On Tuesday, July 28, 2009 7:44 PM +0200 Emmanuel Lecharny elecharny@apache.org wrote:

...

...
That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Any info about the infra you are running this test ?

Thanks !

8 core system.

model name : Intel(R) Xeon(R) CPU L5335 @ 2.00GHz

with 32 GB of RAM

--Quanah

Hallvard B Furuseth

2:23 p.m.

Quanah Gibson-Mount writes:

...

OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

I'm tempted to backport the patches to 2.3 so we can compare with what 2.3 could have been:-)

...

That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Maybe --disable-debug is a place to fetch some speed. If it makes enough of a difference from loglevel 0, it might be an idea to disable _some_ loglevels and some asserts in the default build. After we've done something about that broken log system of course - IIRC today one needs to turn on massive logging to get certain error messages.

-- Hallvard

Quanah Gibson-Mount

2:29 p.m.

--On Tuesday, July 28, 2009 11:23 PM +0200 Hallvard B Furuseth h.b.furuseth@usit.uio.no wrote:

...

Quanah Gibson-Mount writes:

...
OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

I'm tempted to backport the patches to 2.3 so we can compare with what 2.3 could have been:-)

...
That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Maybe --disable-debug is a place to fetch some speed. If it makes enough of a difference from loglevel 0, it might be an idea to disable _some_ loglevels and some asserts in the default build. After we've done something about that broken log system of course - IIRC today one needs to turn on massive logging to get certain error messages.

I'm running with loglevel none, which logs almost nothing. I do that so syslog doesn't cause a perf hit. I don't think it specifically will make much difference in the auth rate I'm getting.

--Quanah

Hallvard B Furuseth

29 Jul

4:38 a.m.

Quanah Gibson-Mount writes:

...

h.b.furuseth@usit.uio.no wrote:

...
Maybe --disable-debug is a place to fetch some speed. If it makes enough of a difference from loglevel 0, it might be an idea to disable _some_ loglevels and some asserts in the default build. (...)

I'm running with loglevel none, which logs almost nothing. I do that so syslog doesn't cause a perf hit.

Right, but without --disable-debug that still executes the Debug() and assert() code which decides not to log anything and not to abort.

-- Hallvard

Howard Chu

28 Jul

2:31 p.m.

Hallvard B Furuseth wrote:

...

Quanah Gibson-Mount writes:

...
OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

I'm tempted to backport the patches to 2.3 so we can compare with what 2.3 could have been:-)

Yeah, that thought crossed my mind too. These are both limited-scope patches so the backport should be pretty easy.

But right now I'd like to focus on the differences in the 2.3 and 2.4 profile results, so we can see where we lost the performance...

...

...
That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Maybe --disable-debug is a place to fetch some speed. If it makes enough of a difference from loglevel 0, it might be an idea to disable _some_ loglevels and some asserts in the default build. After we've done something about that broken log system of course - IIRC today one needs to turn on massive logging to get certain error messages.

Most error messages of any importance are logged at level -1, so they're always displayed if any level of logging was enabled. I don't think this particular area is really broken. Certainly disabling debug support will give a sizeable performance boost, but it will also make diagnostics impossible when anything goes wrong.

We can always tell performance-sensitive users to build and install the whole thing twice, once without debug, and run that unless they need to reproduce a problem... :P

-- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/

Hallvard B Furuseth

30 Jul

6:07 a.m.

Other things that might gain some speed:

Unwrap the pthread wrapper to shave some time spent in critical sections: http://folk.uio.no/hbf/OpenLDAP/unwrap-pthreads.txt

For that matter, unwrap frequently used small wrappers like ber_memalloc.

A Makefile target which builds with gcc -fprofile-generate, runs some tests, then does make clean and rebuilds with -fprofile-use. Except it didn't work for me, ld complained about __gcov_merge_add. Oh well.

Use the C99 'restrict' keyword when available, e.g. on struct berval* parameters. Functions that e.g. receive a struct berval*, use it frequently, and modify bv->bv_val[...] cannot be optimized well: sometimes the members _could_ have been kept in registers, but the compiler cannot know that because, as far as it knows, bv_val may be pointing back into bv, so it must re-read bv after changing bv_val[...]. (Of course another way would be to rewrite such code to extract the bv members to local variables, but that'd be more work per function.)

Howard Chu writes:

...

Hallvard B Furuseth wrote:

...
(...) it might be an idea to disable _some_ loglevels and some asserts in the default build. After we've done something about that broken log system of course - IIRC today one needs to turn on massive logging to get certain error messages.

Most error messages of any importance are logged at level -1, so they're always displayed if any level of logging was enabled. I don't think this particular area is really broken.

I'm fairly sure I've needed once in a while to turn on debug output to see why some slap tool failed, at least. Haven't paid attention to when it happens, I'll try to remember to report it next time it happens.

...

Certainly disabling debug support will give a sizeable performance boost, but it will also make diagnostics impossible when anything goes wrong.

I dunno. Personally I'd be happy to lose TRACE and maybe ARGS from the default, but then I haven't done anything like your amount of debugging. I find myself asking users for loglevel STATS output more often than TRACE - since people think it's a good idea to turn on full logging and then only show the last lines of the log, cut off after the relevant STATS log which might have been all that was needed.

-- Hallvard

Quanah Gibson-Mount

1 Aug

2:08 p.m.

--On July 28, 2009 10:36:08 AM -0700 Quanah Gibson-Mount quanah@zimbra.com wrote:

...

I've been spending some time perf testing OL 2.4 in relation to OL 2.3. Unfortunately, RE24 is noticeably slower than 2.3 was. Results of simple auth testing with slamd show:

OL 2.3: 21,745 auths/second

OL 2.4: 15,733 auths/second

So OL 2.4 is 6,000 auths/second (aka 12,000 searches/second) slower than 2.3. I.e., 27% slower.

Howard committed a patch that slightly helps some situations, and Hallvard has a rewrite of part of the lber library that I've been testing that he'll commit soon. That helps somewhat:

OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Here are the numbers with --enable-debug=no.

OL 2.3: 22,356 auths/second

OL 2.4: 17,396 auths/second

So for 2.3, this is an improvement of 611 auths/second. For 2.4, this is an improvement of 1,663 auths/second. Which I find rather significant. ;)

--Quanah
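
Expressed relative to each version's own debug-enabled rate, the two improvements look quite different. This is just the arithmetic on the figures above, taking 15,733 as the 2.4 baseline as the post does:

```latex
\begin{align*}
\text{OL 2.3 gain} &= \frac{22{,}356 - 21{,}745}{21{,}745} = \frac{611}{21{,}745} \approx 2.8\% \\
\text{OL 2.4 gain} &= \frac{17{,}396 - 15{,}733}{15{,}733} = \frac{1{,}663}{15{,}733} \approx 10.6\% \\
\text{remaining gap} &= \frac{22{,}356 - 17{,}396}{22{,}356} = \frac{4{,}960}{22{,}356} \approx 22.2\%
\end{align*}
```

So disabling debug helps 2.4 nearly four times as much as 2.3 in relative terms, yet 2.4 is still roughly 22% behind when both are built without debug.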

Howard Chu

2:25 p.m.

Quanah Gibson-Mount wrote:

...

--On July 28, 2009 10:36:08 AM -0700 Quanah Gibson-Mount quanah@zimbra.com wrote:

...
I've been spending some time perf testing OL 2.4 in relation to OL 2.3. Unfortunately, RE24 is noticeably slower than 2.3 was. Results of simple auth testing with slamd show:

OL 2.3: 21,745 auths/second

OL 2.4: 15,733 auths/second

So OL 2.4 is 6,000 auths/second (aka 12,000 searches/second) slower than 2.3. I.e., 27% slower.

Howard committed a patch that slightly helps some situations, and Hallvard has a rewrite of part of the lber library that I've been testing that he'll commit soon. That helps somewhat:

OL 2.4 with howard and hallvard's patches: 17,086 auths/second.

That still leaves us over 4,500 auths/second (or 9000 searches/second) slower than RE2.3. I.e., 21.5% slower. Which is quite a substantial gap.

Here are the numbers with --enable-debug=no.

OL 2.3: 22,356 auths/second

OL 2.4: 17,396 auths/second

So for 2.3, this is an improvement of 611 auths/second. For 2.4, this is an improvement of 1,663 auths/second. Which I find rather significant. ;)

OK, that lends some weight to the idea that we have too many assert()s in the 2.4 code...

Aaron Richton

3 Aug

9:03 a.m.

On Sat, 1 Aug 2009, Howard Chu wrote:

...

OK, that lends some weight to the idea that we have too many assert()s in the 2.4 code...

Ehhhhh. I understand this from the performance standpoint, but from the usability standpoint, I really like to die fast when something's wrong. Even in full production and under load.

Configurable for "I want to benchmark above all else," perhaps, but I would be wary of the removal route.

Hallvard B Furuseth

11:45 a.m.

Aaron Richton writes:

...

On Sat, 1 Aug 2009, Howard Chu wrote:

...
OK, that lends some weight to the idea that we have too many assert()s in the 2.4 code...

Or too many Debug()s. To test which, you could make include/ac/assert.h contain just the line "#include <assert.h>", then configure with one but not both of --disable-debug and CPPFLAGS=-DNDEBUG.

...

Ehhhhh. I understand this from the performance standpoint, but from the usability standpoint, I really like to die fast when something's wrong. Even in full production and under load.

Well, there's assert(we kept track of what's going on) and then there's assert(we obeyed our own calling conventions to this static function). And assert(the pointer we are about to follow is not NULL), which helps explain why a crash happens but isn't usually needed to cause the crash.

...

Configurable for "I want to benchmark above all else," perhaps, but I would be wary of the removal route.

I'd like something like an assume() macro and an assert() macro, where assume() could be #defined as assert or noop or, if the compiler supports it, a compiler hint:-)

-- Hallvard

Quanah Gibson-Mount

5 Aug

9:56 a.m.

--On August 1, 2009 2:08:18 PM -0700 Quanah Gibson-Mount quanah@zimbra.com wrote:

...

--On July 28, 2009 10:36:08 AM -0700 Quanah Gibson-Mount quanah@zimbra.com wrote:

...
I've been spending some time perf testing OL 2.4 in relation to OL 2.3. Unfortunately, RE24 is noticeably slower than 2.3 was. Results of simple auth testing with slamd show:

OL 2.3: 21,745 auths/second

OL 2.4: 15,733 auths/second

Using BDB 4.8 (which 2.3 does not support) instead of BDB 4.5 with OL 2.4, the base rate increases to 16,300 auths/second. I'll see what happens with the two perf patches added into the mix with 4.8. Perhaps we can get fairly close to 2.3 that way. :P

--Quanah

openldap-devel@openldap.org
