There are two main problems being addressed by the mdb approach - eliminating
multiple layers of cache copying, and eliminating multiple layers of locking.
Currently we have per-entry mutexes and also BDB locks for every cached item.
The mutex overhead is ridiculous, 40-some bytes per mutex x 10 million entries
is a lot of wasted storage. Especially since a single slapd can only actually
lock 2-3 items per thread - i.e., we only need on the order of 100 locks,
ever. The sane thing to do here is continue to use the BDB approach to
locking, but with fewer options.
By default BDB allocates a new locker for every database handle and every
cursor that's created. This actually causes us a lot of trouble since it
allows two cursors created by the same thread to deadlock each other. (And
it's why we had to go to the trouble of wrapping read operations in a TXN,
even though we don't need any of the ACID properties for reads.) For mdb we
will want an API for which (essentially) threadID == lockerID and cursors and
such will not have their own lockerIDs.
Actual locks will be managed in separate tables. The most common case will be
locks indexed by entryID. For this case we can easily structure the table to
minimize lock and cache contention. E.g., for 16 threads, use a 16x16 array of
structures. Each element contains a vector of 16 (mutex,locker,entryID)
tuples. The element to lock for a given entryID is ((entryID & 0xff0) >> 4).
In the worst case of all 16 threads going after the same entry, they will all
be waiting on a single mutex which is no worse than what we have now. For all
16 threads going after the same slot, they will simply have to use other slots
in the lock vector.
But in the typical case, all threads will be going after different mutexes
residing in different processor cachelines. And, a thread that is iterating
sequentially through the database will be re-using the same mutex/cacheline
for 16 entries at a time.
(16 was chosen just for the purpose of this example. The actual numbers should
be derived from the maximum number of threads allowed to access the database.)
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/