OpenLDAP 2.4.57 on 16-core Oracle Linux VMs with NVMe disks.
8 nodes in an N-way multi-master configuration, MDB backend, 50k unique DNs.
We see about 10,000 authentications per minute per node.

Under heavy client load, the log shows many "deferring operation: binding" messages within the same second, yet slapd is using only 400% CPU (of a possible 1600%).

[2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding
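
To put a number on "many messages in the same second", the deferrals can be counted per second straight from the log. This is only a rough sketch: it assumes every relevant line has exactly the bracketed-timestamp format of the sample line above, and the names `DEFER_RE` and `count_deferrals` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines in the format shown in the sample above, e.g.
# [2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding
DEFER_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"connection_input: conn=(?P<conn>\d+) "
    r"deferring operation: (?P<reason>.+)$"
)


def count_deferrals(lines):
    """Return a Counter mapping (timestamp, reason) to the number of deferrals."""
    counts = Counter()
    for line in lines:
        match = DEFER_RE.match(line.strip())
        if match:
            counts[(match.group("ts"), match.group("reason"))] += 1
    return counts
```

Passing an open log file to `count_deferrals` and sorting the result by count should show whether the deferrals cluster around the seconds when a write was applied.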

When I apply LDIF changes to one node (for example, deleting a user or removing a user from a group), authentication latency spikes on all nodes in the cluster at the same time: response times that are normally 0.2-0.5 seconds go up to 15-30 seconds.
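
As a back-of-the-envelope check on these figures, Little's law (mean requests in flight = arrival rate × mean time in system) gives a rough idea of how many binds are outstanding per node. This assumes the quoted response times are end-to-end averages and that the load is steady, neither of which is confirmed by the measurements, so treat the output as an estimate only.

```python
# Figures quoted in the post; everything else here is arithmetic.
AUTHS_PER_MIN_PER_NODE = 10_000
NORMAL_LATENCY_S = (0.2, 0.5)    # typical response time range
SPIKE_LATENCY_S = (15.0, 30.0)   # response time range during a write


def in_flight(rate_per_sec, latency_s):
    """Little's law: mean requests in system = arrival rate * mean time in system."""
    return rate_per_sec * latency_s


rate = AUTHS_PER_MIN_PER_NODE / 60  # authentications per second per node

normal = [in_flight(rate, w) for w in NORMAL_LATENCY_S]
spike = [in_flight(rate, w) for w in SPIKE_LATENCY_S]

print(f"arrival rate: {rate:.0f} auths/s per node")
print(f"in flight, normal: {normal[0]:.0f} to {normal[1]:.0f}")
print(f"in flight, spike:  {spike[0]:.0f} to {spike[1]:.0f}")
```

If the estimate is in the right ballpark, normal-case concurrency (roughly 33 to 83 outstanding binds) is already at or above a 32-thread pool, which would be consistent with operations being deferred. It does not by itself explain why a write triggers the spike.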

What knobs can be adjusted to allow for more concurrency? It seems like writes are impacting reads.

slapd.conf: threads
default is 32, tried 64 and 128 with little improvement

slapd.conf: syncrepl
Should I increase sessionlog size?
Should I increase checkpoint ops?
How to determine optimum values?

syncprov-checkpoint 100 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE

maxsize 17179869184

index   objectClass                     eq,pres
index   cn,uid,mail,mobile              eq,pres,sub
index   o,ou,dc,preferredLanguage       eq,pres
index   member,memberUid                eq,pres
index   uidNumber,gidNumber             eq,pres
index   memberOf                        eq
index   entryUUID                       eq
index   entryCSN                        eq
index   uniqueMember                    eq
index   sAMAccountName                  eq

bash-4.2$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 482391
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

n-way config

serverID 1 ldap://XXXX:12389

syncrepl rid=1
 retry="60 +"

(and 7 more)
mirrormode on

Any ideas?