Howard Chu wrote:
In the meantime, I'm considering an MDB environment option to support multiple threads in a single write TXN, as long as each thread is operating on separate databases. This would hopefully allow us to further distribute the load of indexing without adding too much new complexity to libmdb.
A fairly naive implementation is now in my thread2 branch on ada. With 6 toolthreads the DB now loads in 58m51s, a sizable improvement:
  time LD_PRELOAD=/usr/local/lib/libtcmalloc_minimal.so ../servers/slapd/slapd -Ta -F 7170/conf0 -n 3 -l tester.ldif -q
  *#################### 100.00% eta none elapsed 58m51s spd 1.3 M/s
  Closing DB...

  real 58m53.590s
  user 90m15.182s
  sys  31m16.697s
The main changes are to mutex-protect a few of the internals:
  all of the mdb_page_alloc() function (which manipulates the DB freelist and pulls pages off it)
  mt_free_pgs (which accumulates the list of pages freed in the current txn)
  mt_dirty_list (which accumulates the list of pages dirtied in the current txn)
The mutex on the mt_dirty_list was restricting concurrency too much, so I split the list into 256 lists, hashed off the pgno. This accounts for the bulk of the improvement.
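For anyone who wants the shape of the split without reading the branch, here is a minimal sketch of the idea. It is not the thread2 code: the names dirty_set, dirty_bucket, dirty_add and the rest are hypothetical, the buckets hold bare page numbers where the real dirty list also tracks the pages themselves, and the hash is assumed to be just the low 8 bits of the pgno. Only the count of 256 lists and hashing off the pgno come from the description above.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define DIRTY_BUCKETS 256          /* number of lists, as described above */

typedef size_t pgno_t;             /* stand-in for libmdb's page number type */

/* One slice of the dirty list, guarded by its own mutex (hypothetical layout). */
typedef struct dirty_bucket {
    pthread_mutex_t lock;
    pgno_t *pgnos;                 /* page numbers dirtied in this txn */
    size_t len, cap;
} dirty_bucket;

typedef struct dirty_set {
    dirty_bucket b[DIRTY_BUCKETS];
} dirty_set;

/* Assumed hash: low 8 bits of the page number. */
static unsigned dirty_hash(pgno_t pgno)
{
    return (unsigned)(pgno & (DIRTY_BUCKETS - 1));
}

static void dirty_init(dirty_set *ds)
{
    for (int i = 0; i < DIRTY_BUCKETS; i++) {
        pthread_mutex_init(&ds->b[i].lock, NULL);
        ds->b[i].pgnos = NULL;
        ds->b[i].len = ds->b[i].cap = 0;
    }
}

/* Record a dirtied page. Only the bucket the pgno hashes to is locked,
 * so threads touching pages in different buckets do not contend.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int dirty_add(dirty_set *ds, pgno_t pgno)
{
    dirty_bucket *bk = &ds->b[dirty_hash(pgno)];
    int rc = 0;

    pthread_mutex_lock(&bk->lock);
    if (bk->len == bk->cap) {
        size_t ncap = bk->cap ? bk->cap * 2 : 16;
        pgno_t *n = realloc(bk->pgnos, ncap * sizeof *n);
        if (n == NULL) {
            rc = -1;
        } else {
            bk->pgnos = n;
            bk->cap = ncap;
        }
    }
    if (rc == 0)
        bk->pgnos[bk->len++] = pgno;
    pthread_mutex_unlock(&bk->lock);
    return rc;
}

/* Total number of dirty pages across all buckets. */
static size_t dirty_count(dirty_set *ds)
{
    size_t total = 0;
    for (int i = 0; i < DIRTY_BUCKETS; i++) {
        pthread_mutex_lock(&ds->b[i].lock);
        total += ds->b[i].len;
        pthread_mutex_unlock(&ds->b[i].lock);
    }
    return total;
}

static void dirty_destroy(dirty_set *ds)
{
    for (int i = 0; i < DIRTY_BUCKETS; i++) {
        free(ds->b[i].pgnos);
        pthread_mutex_destroy(&ds->b[i].lock);
    }
}
```

With a single list every dirtied page serializes on one mutex. With the split, two threads only collide when their page numbers hash to the same bucket, which is the property that matters when several threads are indexing into separate databases within one write txn.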
If you have an account on ada.openldap.org I encourage you to check out this code and think about what's suitable to merge back into master.