On Mon, Oct 20, 2014 at 2:34 PM, Howard Chu <hyc@symas.com> wrote:
Luke Kenneth Casson Leighton wrote:
On Mon, Oct 20, 2014 at 1:53 PM, Howard Chu <hyc@symas.com> wrote:
My experience from benchmarking OpenLDAP over the years is that mutexes scale only up to a point. When you have threads grabbing the same mutex from across socket boundaries, things go into the toilet. There's no fix for this; that's the nature of inter-socket communication.
argh. ok. so... actually.... accidentally, the design where i used a single LMDB (one env) shared amongst (20 to 30) processes using db_open to create (10 or so) databases would mitigate against that... taking a quick look at mdb.c the mutex lock is done on the env not on the database...
sooo compared to the previous design there would only be a 20/30-to-1 mutex contention whereas previously there were *10 sets* of 20 or 30 to 1 mutexes all competing... and if mutexes use sockets underneath that would explain why the inter-process communication (which also used sockets) was so dreadful.
Note - I was talking about physical CPU sockets, not network sockets.
oh right, haha :) ok scratch that theory then.
or is it relatively easy to turn off the NUMA architecture?
I can probably use taskset or something similar to restrict a process to a particular set of cores. What exactly do you have in mind?
keeping the writers and readers proposed test, but onto the same 8 cores only, running:
* 1 writer 16 readers (single program, 16 threads) as a base-line, 2-to-1 contention between threads and cores
* creating extra tests adding extra programs (single writers only) first 1 extra writer, then 2, then 4, then 8, then maybe even 16.
the idea is to see how the mutexes affect performance as a... (can't think of the word!!) factor(??) of the number of writers, but without the effects of NUMA to contend with.
l.
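
A minimal sketch of the writer-scaling test proposed above, not an LMDB benchmark. It assumes Linux, assumes cores 0-7 all sit on one physical CPU socket (check with lscpu), and uses a multiprocessing.Lock as a stand-in for the env-wide writer mutex. The names pin_to_cores, writer_schedule, writer and run are made up for illustration, and the 16 reader threads are left out because only writer contention is being modelled.

```python
import multiprocessing as mp
import os
import time

# Assumption: cores 0-7 belong to a single physical socket on the test box.
CORES = set(range(8))


def pin_to_cores(cores):
    """Restrict the calling process to `cores`; return False if not possible."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # affinity calls are Linux-only
    usable = set(cores) & os.sched_getaffinity(0)
    if not usable:
        return False  # none of the requested cores are available to us
    os.sched_setaffinity(0, usable)
    return True


def writer_schedule(max_writers):
    """Writer counts to test: 1, 2, 4, ... up to max_writers."""
    counts, n = [], 1
    while n <= max_writers:
        counts.append(n)
        n *= 2
    return counts


def writer(lock, counter, seconds):
    """Stand-in for a writer: take the shared lock, bump a shared counter."""
    pin_to_cores(CORES)
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        with lock:
            counter.value += 1


def run(n_writers, seconds=0.5):
    """Return the total lock acquisitions made by n_writers processes."""
    lock = mp.Lock()                        # one mutex shared by every writer
    counter = mp.Value("q", 0, lock=False)  # guarded by `lock` above
    procs = [mp.Process(target=writer, args=(lock, counter, seconds))
             for _ in range(n_writers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    for n in writer_schedule(16):
        total = run(n)
        print(f"{n:2d} writers: {total} lock acquisitions in 0.5s")
```

If the total stays flat or drops as writers are added, the mutex is the bottleneck even without NUMA in the picture. The same restriction can be applied from outside with taskset, e.g. `taskset -c 0-7 python3 bench.py` (bench.py being a hypothetical file name for the script above).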