OpenLDAP 2.4.57 on 16-core Oracle Linux VMs with NVMe disks.
8 nodes in an N-way multi-master configuration, MDB backend, 50k unique DNs.
We see about 10,000 authentications per minute per node.
Under heavy client load, the log shows many "deferring operation: binding" messages within the same second, yet slapd is using only about 400% CPU (of a possible 1600%). Example:
[2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding
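To put a number on "many messages in the same second", a small script along these lines could count the deferrals per second. This is only a sketch: it assumes every log line has the same shape as the sample line above, and the function name and the log path in the usage comment are made up for illustration.

```python
import re
import sys
from collections import Counter

# Matches lines shaped like the sample above, e.g.
# [2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding
# (assumption: the timestamp prefix and message text never vary)
DEFER_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"connection_input: conn=(?P<conn>\d+) deferring operation: binding"
)


def count_deferrals_per_second(lines):
    """Return a Counter mapping each one-second timestamp to its deferral count."""
    counts = Counter()
    for line in lines:
        match = DEFER_RE.match(line.strip())
        if match:
            counts[match.group("ts")] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_deferrals.py /path/to/slapd.log
    with open(sys.argv[1], errors="replace") as fh:
        # Print the ten busiest seconds, busiest first.
        for ts, n in count_deferrals_per_second(fh).most_common(10):
            print(ts, n)
```

Lining up the busiest seconds with the times of the LDIF writes should show whether the deferrals track the write activity or occur independently of it.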
When I apply LDIF changes to one node (for example, deleting a user or removing a user from a group), authentication latency spikes on all nodes in the cluster at the same time: response times that are normally 0.2 to 0.5 seconds climb to 15 to 30 seconds.
What knobs can be adjusted to allow for more concurrency? It seems like writes are impacting reads.
slapd.conf: threads
Started at 32 (slapd.conf(5) lists the stock default as 16); tried 64 and 128 with little improvement.
slapd.conf: syncrepl / syncprov overlay
Should I increase the sessionlog size?
Should I increase the checkpoint ops?
How do I determine the optimum values?
syncprov-checkpoint 100 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE
mdb
maxsize 17179869184
Indices
index objectClass eq,pres
index cn,uid,mail,mobile eq,pres,sub
index o,ou,dc,preferredLanguage eq,pres
index member,memberUid eq,pres
index uidNumber,gidNumber eq,pres
index memberOf eq
index entryUUID eq
index entryCSN eq
index uniqueMember eq
index sAMAccountName eq
ulimit
bash-4.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 482391
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
n-way config
serverID 1 ldap://XXXX:12389
syncrepl rid=1
  provider=ldap://XXXXXX:12389
  bindmethod=simple
  starttls=yes
  tls_cert=/opt/slapd/conf/cert.pem
  tls_cacert=/etc/pki/tls/cert.pem
  tls_key=/opt/slapd/conf/key.pem
  binddn="cn=replication_manager,dc=service-accounts,o=Root"
  credentials="YYYYYY"
  tls_reqcert=never
  searchbase=""
  schemachecking=on
  type=refreshAndPersist
  retry="60 +"
(and 7 more)
mirrormode on
Any ideas?
Thanks
-Zetan