A bit of a summary of how the backend is shaping up. I've been testing with a
variety of synthetic LDIFs as well as an actual application database (Zimbra
accounts).
I noted before that back-mdb's write speeds on disk are quite slow. This is
because many of its writes go to random disk pages, and because the data
writes in a transaction commit are followed by a meta page write, which always
involves a seek back to page 0 or page 1 of the DB file. For slapadd -q this
effect can be somewhat hidden because the writes are done with MDB_NOSYNC
specified, so no explicit flushes are performed. In my current tests with
synchronous writes, back-mdb is about half the speed of back-bdb/hdb.
(Even in fully synchronous mode, BDB only writes its transaction logs
synchronously, and those are always sequential writes so there's no seek
overhead to deal with.)
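For a sense of what that MDB_NOSYNC path looks like, here's a minimal
bulk-load sketch against the MDB API as it stands today; the key/data values
are placeholders and error handling is omitted. Commits skip the flush
entirely, and one explicit mdb_env_sync() at the end makes the result durable.

    #include <stddef.h>
    #include "lmdb.h"

    /* Bulk-load sketch: with MDB_NOSYNC, mdb_txn_commit() does no flush,
     * so the only durable point is the explicit mdb_env_sync() at the end,
     * which is roughly what slapadd -q relies on. 64-bit build assumed for
     * the 32GB map. Error handling omitted for brevity. */
    int bulk_load(const char *path)
    {
        MDB_env *env;
        MDB_txn *txn;
        MDB_dbi dbi;
        MDB_val key, data;

        mdb_env_create(&env);
        mdb_env_set_mapsize(env, (size_t)32 * 1024 * 1024 * 1024);
        mdb_env_open(env, path, MDB_NOSYNC, 0664);

        mdb_txn_begin(env, NULL, 0, &txn);      /* the single write txn */
        mdb_dbi_open(txn, NULL, 0, &dbi);       /* main (unnamed) DB */

        key.mv_size  = sizeof("e1");            /* placeholder key/data */
        key.mv_data  = "e1";
        data.mv_size = sizeof("entry blob");
        data.mv_data = "entry blob";
        mdb_put(txn, dbi, &key, &data, 0);

        mdb_txn_commit(txn);                    /* no fsync with MDB_NOSYNC */
        mdb_env_sync(env, 1);                   /* one explicit flush at end */
        mdb_env_close(env);
        return 0;
    }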
With that said, slapadd -q for a 3.2M entry database on a tmpfs:
back-hdb: real 75m32.678s user 84m31.733s sys 1m0.316s
back-mdb: real 63m51.048s user 50m23.125s sys 13m27.958s
For back-hdb, BDB was configured with a 32GB environment cache. The resulting
DB directory consumed 14951004KB including data files and environment files.
For back-mdb, MDB was configured with a 32GB mapsize. The resulting DB
directory consumed 18299832KB. The input LDIF was 2.7GB, and there were 29
attributes indexed. Currently MDB is somewhat wasteful with space in the
sorted-duplicate databases that are used for indexing; there's definitely
room for improvement here.
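For context, each index is an MDB sorted-duplicate database, where one index
key maps to many entry IDs. A rough illustration of that layout follows; the
DB name, key value, and helper are made up, not back-mdb's actual schema.

    #include "lmdb.h"

    /* Illustrative only: one equality-index slot as an MDB_DUPSORT database,
     * mapping an index key to many entry IDs. Not back-mdb's actual schema;
     * assumes mdb_env_set_maxdbs() was called when the env was set up. */
    int index_add(MDB_env *env, unsigned long entry_id)
    {
        MDB_txn *txn;
        MDB_dbi idx;
        MDB_val key, data;

        mdb_txn_begin(env, NULL, 0, &txn);
        mdb_dbi_open(txn, "mail_eq_index", MDB_DUPSORT | MDB_CREATE, &idx);

        key.mv_size  = sizeof("hashed-value");  /* stand-in for an index hash */
        key.mv_data  = "hashed-value";
        data.mv_size = sizeof(entry_id);
        data.mv_data = &entry_id;

        /* MDB_NODUPDATA: skip the put if this key/ID pair already exists */
        mdb_put(txn, idx, &key, &data, MDB_NODUPDATA);
        return mdb_txn_commit(txn);
    }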
Also this slapadd was done with tool-threads set to 1, because back-mdb only
allows one writer at a time anyway. There is also obviously room for
improvement here, in terms of a bulk-loading API for the MDB library.
With the DB loaded, a search that scans every entry in the DB was run against
each server and timed.
Initially back-hdb was configured with a cachesize of only 10000 and an
idlcachesize of 10000. It was then tested again with a cachesize of 5,000,000
(more than was needed, since the DB only contained 3,200,100 entries).
In each configuration a search was performed twice - once to measure the time
to go from an empty cache to a fully primed cache, and again to measure the
time for the fully cached search.
                        first        second       slapd size
back-hdb, 10K cache     3m6.906s     1m39.835s    7.3GB
back-hdb, 5M cache      3m12.596s    0m10.984s    46.8GB
back-mdb                0m19.420s    0m16.625s    7.0GB
Next, the time to execute multiple instances of this search was measured,
using 2, 4, 8, and 16 ldapsearch instances running concurrently.
                 average result time
                 2            4            8            16
back-hdb, 5M     0m14.147s    0m17.384s    0m45.665s    17m15.114s
back-mdb         0m16.701s    0m16.688s    0m16.621s    0m16.955s
I don't recall doing this test against back-hdb on ada.openldap.org before;
certainly the total blowup at 16 searches was unexpected. But as you can see,
with no read locks in back-mdb, search performance is pretty much independent
of load. At 16 threads back-mdb slowed down measurably, but that's
understandable given that the rest of the system still needed CPU cycles here
and there. Otherwise, slapd was running at 1600% CPU the entire time. For
back-hdb, slapd maxed out at 1467% CPU; the lock overhead drove it into the
ground.
So far I'm pretty pleased with the results; for the most part back-mdb is
delivering on what I expected. Decoding each entry every time is a bit of a
slowdown, compared to having entries fully cached. But the cost disappears as
soon as you get more than a couple requests running at once.
Overall I believe it proves the basic philosophy - in this day and age, it's a
waste of application developers' time to incorporate a caching layer into
their own code. The OS already does it and does it well. Give yourself as
transparent a path as possible between RAM and disk using mmap, and don't fuss
with it any further.
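To illustrate the point with generic code (not MDB internals): map the data
file once and hand callers pointers straight into the mapping, so the OS page
cache is the only cache.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Generic illustration, not MDB internals: map a data file read-only and
     * return a pointer straight into the mapping. The OS page cache is the
     * only cache; nothing gets copied into an application-level cache. */
    const char *map_datafile(const char *path, size_t *len)
    {
        struct stat st;
        void *p;
        int fd = open(path, O_RDONLY);

        if (fd < 0)
            return NULL;
        if (fstat(fd, &st) < 0) {
            close(fd);
            return NULL;
        }
        p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);                 /* the mapping stays valid after close */
        if (p == MAP_FAILED)
            return NULL;
        *len = st.st_size;
        return p;                  /* callers read records in place */
    }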
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
It's looking like the single mutex-controlled thread pool is a major
bottleneck in the slapd frontend. Thinking it over, I've hit on a number of
different ideas, but nothing without drawbacks.
Ideally we would get rid of the distinction between listener threads and
worker threads, and only have worker threads. Each thread would be responsible
for a fraction of the open sockets, and service them directly instead of
queueing work into a thread pool. This would essentially mimic the behavior of
SLAPD_NO_THREADS, just duplicated N times.
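As a rough sketch of that shape (hypothetical code, not the current slapd
frontend): each worker owns a fixed subset of the sockets and loops over
poll() and direct dispatch, with no shared work queue or pool mutex.

    #include <poll.h>
    #include <stddef.h>

    /* Hypothetical sketch of the proposed design, not current slapd code:
     * each worker thread owns a fixed subset of the open sockets (e.g. by
     * fd modulo the number of workers) and services them directly.
     * process_op() stands in for operation dispatch. */
    #define CONNS_PER_WORKER 1024

    struct worker {
        struct pollfd fds[CONNS_PER_WORKER];
        int nfds;
    };

    extern int process_op(int fd);   /* placeholder for operation dispatch */

    void *worker_loop(void *arg)
    {
        struct worker *w = arg;
        int i;

        for (;;) {
            /* wait only on this worker's own connections */
            if (poll(w->fds, w->nfds, -1) <= 0)
                continue;
            for (i = 0; i < w->nfds; i++) {
                if (w->fds[i].revents & POLLIN)
                    process_op(w->fds[i].fd);  /* handled in-line, like
                                                  SLAPD_NO_THREADS times N */
            }
        }
        return NULL;
    }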
The upsides of such an approach are numerous; a whole slew of locks completely
disappear from the design and we'd be keeping work local to the CPU that
originally received a request. The obvious downside is that Abandon/Cancel ops
would never be useful (as they currently are not useful in single-threaded
slapd). I.e., since the thread responsible for a connection will always be
occupied in actually processing an operation, it will never come back to read
the next request on the connection (e.g. Abandon) until the current op is
already finished.
A possible solution to that would be to always do a quick poll in
send_ldap_response() etc. to check for new requests on a connection before
sending another reply.
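Something along these lines (hypothetical, not an actual send_ldap_response()
patch): a zero-timeout poll on the connection just before writing a reply, so
a waiting Abandon can be noticed without returning to the event loop.

    #include <poll.h>

    /* Hypothetical sketch, not an actual send_ldap_response() patch: before
     * writing a reply, do a zero-timeout poll on the connection so a pending
     * Abandon/Cancel can be noticed even though this thread never returned
     * to the event loop. check_for_abandon() is a made-up placeholder. */
    extern int check_for_abandon(int fd);   /* peeks at a waiting request */

    int pending_abandon(int conn_fd)
    {
        struct pollfd pfd = { .fd = conn_fd, .events = POLLIN };

        /* timeout 0: just check, never block the reply path */
        if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN))
            return check_for_abandon(conn_fd);
        return 0;
    }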
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/