OpenLDAP 2.4.57 on 16-core Oracle Linux VMs with NVMe disks. 8 nodes in an N-way multi-master configuration, MDB backend, 50k unique DNs. We see about 10,000 auths per minute per node.
Under heavy client load, the log shows many "deferring operation: binding" messages within the same second, yet slapd is using only 400% CPU (of 1600% possible). Example log line:
[2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding
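To put a number on "many messages in the same second", a small script along these lines could tally the deferrals per second. This is only a sketch: it assumes every log line has exactly the format of the sample above, and the function name `count_deferrals` and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Assumes the exact line format shown in the sample above:
# [YYYY-MM-DD HH:MM:SS] connection_input: conn=<id> deferring operation: <reason>
DEFER_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"connection_input: conn=(?P<conn>\d+) deferring operation: (?P<reason>.+)$"
)


def count_deferrals(lines):
    """Return a Counter mapping (timestamp, reason) to the number of deferred operations."""
    counts = Counter()
    for line in lines:
        match = DEFER_RE.match(line.strip())
        if match:
            counts[(match.group("ts"), match.group("reason"))] += 1
    return counts


# Illustrative input only; in practice, pass an open log file instead of a list.
sample_lines = [
    "[2021-04-13 19:15:58] connection_input: conn=150474 deferring operation: binding",
    "[2021-04-13 19:15:58] connection_input: conn=150475 deferring operation: binding",
    "some unrelated log line",
]

for (ts, reason), n in sorted(count_deferrals(sample_lines).items()):
    print(ts, reason, n)
```

Lining the per-second counts up against the timestamps of the writes should show whether the deferrals cluster around them.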
When I apply LDIF changes to one node (for example, deleting a user or removing a user from a group), authentication latency spikes across all nodes in the cluster at the same time: response times that are normally 0.2 to 0.5 seconds go up to 15 to 30 seconds.
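As a rough back-of-the-envelope check, Little's law (average operations in flight = arrival rate × latency) gives a feel for how many binds are outstanding per node at these latencies. This is a sketch under assumptions: it treats the quoted response times as the time each bind occupies the server, and assumes clients keep arriving at the same rate during a spike. The helper name `ops_in_flight` is made up.

```python
def ops_in_flight(rate_per_min, latency_s):
    """Little's law: average concurrency = arrival rate (per second) * latency (seconds)."""
    return rate_per_min / 60.0 * latency_s


RATE_PER_MIN = 10_000  # auths per minute per node, as stated in this post

# Normal latencies (0.2 and 0.5 s) and spike latencies (15 and 30 s) from this post.
for latency in (0.2, 0.5, 15.0, 30.0):
    in_flight = ops_in_flight(RATE_PER_MIN, latency)
    print(f"latency {latency:5.1f}s: about {in_flight:.0f} binds in flight")
```

If those assumptions hold, the normal case is already in the tens of concurrent binds per node and the spike case in the thousands, which is the figure to compare against the configured thread pool size.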
What knobs can be adjusted to allow for more concurrency? It seems like writes are impacting reads.
*slapd.conf: threads* default is 32, tried 64 and 128 with little improvement
*slapd.conf: syncrepl* Should I increase sessionlog size? Should I increase checkpoint ops? How to determine optimum values?
syncprov-checkpoint 100 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE
*mdb* maxsize 17179869184
*Indices*
index objectClass eq,pres
index cn,uid,mail,mobile eq,pres,sub
index o,ou,dc,preferredLanguage eq,pres
index member,memberUid eq,pres
index uidNumber,gidNumber eq,pres
index memberOf eq
index entryUUID eq
index entryCSN eq
index uniqueMember eq
index sAMAccountName eq
*ulimit*
bash-4.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 482391
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
*n-way config*
serverID 1 ldap://XXXX:12389
syncrepl rid=1
  provider=ldap://XXXXXX:12389
  bindmethod=simple
  starttls=yes
  tls_cert=/opt/slapd/conf/cert.pem
  tls_cacert=/etc/pki/tls/cert.pem
  tls_key=/opt/slapd/conf/key.pem
  binddn="cn=replication_manager,dc=service-accounts,o=Root"
  credentials="YYYYYY"
  tls_reqcert=never
  searchbase=""
  schemachecking=on
  type=refreshAndPersist
  retry="60 +"
(and 7 more)
mirrormode on
Any ideas? Thanks -Zetan