OpenLDAP 2.4.57 on 16-core Oracle Linux VMs with NVMe disks.
8 nodes in an N-way multi-master configuration, MDB backend, 50k unique DNs.
We see about 10,000 auths per minute per node (roughly 167 per second).
Under heavy client load, the log shows many "deferring operation: binding"
messages within the same second, yet slapd is using only about 400% CPU
(of a possible 1600%).
[2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding
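To put a number on "many messages in the same second", here is a minimal Python sketch that counts deferrals per second. It assumes the log uses exactly the line layout shown above; the names `DEFER_RE` and `deferrals_per_second` and the second sample line (conn=150475) are made up for illustration.

```python
import re
from collections import Counter

# Matches the line layout shown in the post. Other slapd log layouts
# (for example syslog-prefixed lines) would need a different pattern.
DEFER_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"connection_input: conn=(?P<conn>\d+) "
    r"deferring operation: (?P<reason>.+)$"
)


def deferrals_per_second(lines):
    """Count deferral messages, keyed by (timestamp, reason)."""
    counts = Counter()
    for line in lines:
        match = DEFER_RE.match(line.strip())
        if match:
            counts[(match.group("ts"), match.group("reason"))] += 1
    return counts


if __name__ == "__main__":
    # The first line is from the post; the second is a hypothetical neighbour.
    sample = [
        "[2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding",
        "[2021-04-13 19:15:58] connection_input: conn=150475 deferring operation: binding",
    ]
    for (timestamp, reason), count in deferrals_per_second(sample).most_common():
        print(timestamp, reason, count)
```

Feeding it the real log (for example `deferrals_per_second(open(path))`, with `path` being wherever slapd logs on your system) shows which seconds are worst, and those can then be lined up against the times the LDIF writes were applied.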
When I apply LDIF changes to one node, such as deleting a user or removing a
user from a group, authentication latency spikes across all nodes in the
cluster at the same time: response times that are normally 0.2-0.5 seconds
climb to 15-30 seconds.
What knobs can be adjusted to allow for more concurrency? It seems like
writes are impacting reads.
*slapd.conf: threads*
default is 32, tried 64 and 128 with little improvement
*slapd.conf: syncrepl*
Should I increase sessionlog size?
Should I increase checkpoint ops?
How do I determine the optimum values?
syncprov-checkpoint 100 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE
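For readers less familiar with these three directives, the sketch below spells out what each value controls. The descriptions paraphrase my reading of the slapo-syncprov(5) man page and should be checked against the page for your version; the function name `describe_syncprov` is made up for illustration.

```python
def describe_syncprov(config_text):
    """Return one plain-English note per recognised syncprov directive."""
    notes = []
    for raw in config_text.splitlines():
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        name, args = parts[0].lower(), parts[1:]
        if name == "syncprov-checkpoint" and len(args) == 2:
            ops, minutes = int(args[0]), int(args[1])
            notes.append(
                f"checkpoint: contextCSN is written once more than {ops} writes "
                f"or more than {minutes} minutes have passed since the last one"
            )
        elif name == "syncprov-sessionlog" and len(args) == 1:
            notes.append(
                f"sessionlog: records up to {int(args[0])} write operations"
            )
        elif name == "syncprov-reloadhint" and len(args) == 1:
            notes.append(
                f"reloadhint: honor the reloadHint flag in the Sync control = {args[0]}"
            )
    return notes


if __name__ == "__main__":
    current = """\
syncprov-checkpoint 100 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE
"""
    for note in describe_syncprov(current):
        print(note)
```

This only restates the configured values; it does not say what the right values are for a given write rate.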
*mdb*
maxsize 17179869184
*Indices*
index objectClass eq,pres
index cn,uid,mail,mobile eq,pres,sub
index o,ou,dc,preferredLanguage eq,pres
index member,memberUid eq,pres
index uidNumber,gidNumber eq,pres
index memberOf eq
index entryUUID eq
index entryCSN eq
index uniqueMember eq
index sAMAccountName eq
*ulimit*
bash-4.2$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 482391
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
*n-way config*
serverID 1 ldap://XXXX:12389
syncrepl rid=1
  provider=ldap://XXXXXX:12389
  bindmethod=simple
  starttls=yes
  tls_cert=/opt/slapd/conf/cert.pem
  tls_cacert=/etc/pki/tls/cert.pem
  tls_key=/opt/slapd/conf/key.pem
  binddn="cn=replication_manager,dc=service-accounts,o=Root"
  credentials="YYYYYY"
  tls_reqcert=never
  searchbase=""
  schemachecking=on
  type=refreshAndPersist
  retry="60 +"
(and 7 more)
mirrormode on
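Since only one of the eight stanzas is shown, here is a rough Python sketch of how the full set could be laid out. It assumes every node carries the same eight serverID lines and eight syncrepl stanzas, which is one common layout but is not confirmed by the excerpt above. The host names in `HOSTS` are hypothetical stand-ins for the elided ones, and `render` is a made-up helper name.

```python
# Hypothetical host names; the real ones are elided in the post.
HOSTS = [f"ldap{i}.example.com" for i in range(1, 9)]

# Per-stanza settings copied from the stanza shown in the post.
# The credentials value stays the post's placeholder.
COMMON = [
    "bindmethod=simple",
    "starttls=yes",
    "tls_cert=/opt/slapd/conf/cert.pem",
    "tls_cacert=/etc/pki/tls/cert.pem",
    "tls_key=/opt/slapd/conf/key.pem",
    'binddn="cn=replication_manager,dc=service-accounts,o=Root"',
    'credentials="YYYYYY"',
    "tls_reqcert=never",
    'searchbase=""',
    "schemachecking=on",
    "type=refreshAndPersist",
    'retry="60 +"',
]


def render(hosts, port=12389):
    """Build the serverID lines, one syncrepl stanza per host, and mirrormode."""
    out = [f"serverID {n} ldap://{host}:{port}" for n, host in enumerate(hosts, 1)]
    for n, host in enumerate(hosts, 1):
        out.append(f"syncrepl rid={n}")
        # Continuation lines are indented so they read as part of the stanza.
        out.append(f"  provider=ldap://{host}:{port}")
        out.extend(f"  {setting}" for setting in COMMON)
    out.append("mirrormode on")
    return "\n".join(out)


if __name__ == "__main__":
    print(render(HOSTS))
```

Generating the stanzas from one list keeps the rid numbers and provider URLs consistent across all eight nodes.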
Any ideas?
Thanks
-Zetan