back-mdb locking strategies

25 Sep 2009


      There are two main problems being addressed by the mdb approach - eliminating 
multiple layers of cache copying, and eliminating multiple layers of locking.
Currently we have per-entry mutexes and also BDB locks for every cached item. 
The mutex overhead is ridiculous, 40-some bytes per mutex x 10 million entries 
is a lot of wasted storage. Especially since a single slapd can only actually 
lock 2-3 items per thread - i.e., we only need on the order of 100 locks, 
ever. The sane thing to do here is continue to use the BDB approach to 
locking, but with fewer options.
By default BDB allocates a new locker for every database handle and every 
cursor that's created. This actually causes us a lot of trouble since it 
allows two cursors created by the same thread to deadlock each other. (And 
it's why we had to go to the trouble of wrapping read operations in a TXN, 
even though we don't need any of the ACID properties for reads.) For mdb we 
will want an API for which (essentially) threadID == lockerID and cursors and 
such will not have their own lockerIDs.
Actual locks will be managed in separate tables. The most common case will be 
locks indexed by entryID. For this case we can easily structure the table to 
minimize lock and cache contention. E.g., for 16 threads, use a 16x16 array of 
structures. Each element contains a vector of 16 (mutex,locker,entryID) 
tuples. The element to lock for a given entryID is ((entryID & 0xff0) >> 4). 
In the worst case of all 16 threads going after the same entry, they will all 
be waiting on a single mutex which is no worse than what we have now. For all 
16 threads going after the same slot, they will simply have to use other slots 
in the lock vector.
But in the typical case, all threads will be going after different mutexes 
residing in different processor cachelines. And, a thread that is iterating 
sequentially through the database will be re-using the same mutex/cacheline 
for 16 entries at a time.
(16 was chosen just for the purpose of this example. The actual numbers should 
be derived from the maximum number of threads allowed to access the database.)
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

back-mdb locking strategies