https://bugs.openldap.org/show_bug.cgi?id=9277
Issue ID: 9277
Summary: restart 3+ providers at once burns CPU forever
Product: OpenLDAP
Version: 2.4.50
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs@openldap.org
Reporter: michael@stroeder.com
Target Milestone: ---
I have tiny VMs configured as Æ-DIR servers, 5 providers (multi-provider replication) and 5 read-only consumers each syncing with all providers.
Restarting all consumers at once simply works, no matter how many of the providers are up.
Restarting only two providers at once also works.
But when restarting more than two providers at once, all of them seem to hang, eating up CPU.
It could be the same issue as ITS#8650 / ITS#9210, but those only mention GnuTLS being affected, whereas all my Æ-DIR test servers run slapd built against OpenSSL (openSUSE, Debian buster, CentOS7).
--- Comment #1 from Howard Chu hyc@openldap.org ---
If you suspect that TLS is the cause, then it should be simple to verify by trying to reproduce the issue with TLS disabled.
--- Comment #2 from Michael Ströder michael@stroeder.com ---
FWIW, with loglevel conns these messages appear in syslog:
slapd starting
daemon: added 6r listener=(nil)
daemon: added 9r listener=0x564a25cd20c0
daemon: added 10r listener=0x564a25cd2180
daemon: added 11r listener=0x564a25cd2240
daemon: added 12r listener=0x564a25cd2300
daemon: added 13r listener=0x564a25cd23c0
daemon: epoll: listen=9 active_threads=0 tvp=zero
daemon: epoll: listen=10 active_threads=0 tvp=zero
daemon: epoll: listen=11 active_threads=0 tvp=zero
daemon: epoll: listen=12 active_threads=0 tvp=zero
daemon: epoll: listen=13 active_threads=0 tvp=zero
daemon: activity on 1 descriptor
daemon: activity on:
2020-06-15T17:32:14.014203+02:00 ae-dir-suse-p1 ae-slapd[11935]:
daemon: epoll: listen=9 active_threads=0 tvp=zero
daemon: epoll: listen=10 active_threads=0 tvp=zero
daemon: epoll: listen=11 active_threads=0 tvp=zero
daemon: epoll: listen=12 active_threads=0 tvp=zero
daemon: epoll: listen=13 active_threads=0 tvp=zero
daemon: activity on 1 descriptor
daemon: activity on:
2020-06-15T17:32:15.028227+02:00 ae-dir-suse-p1 ae-slapd[11935]:
daemon: epoll: listen=9 active_threads=0 tvp=zero
daemon: epoll: listen=10 active_threads=0 tvp=zero
daemon: epoll: listen=11 active_threads=0 tvp=zero
daemon: epoll: listen=12 busy
daemon: epoll: listen=13 active_threads=0 tvp=zero
daemon: epoll: listen=9 active_threads=0 tvp=zero
daemon: epoll: listen=10 active_threads=0 tvp=zero
daemon: epoll: listen=11 active_threads=0 tvp=zero
daemon: epoll: listen=12 busy
daemon: epoll: listen=13 active_threads=0 tvp=zero
daemon: epoll: listen=9 active_threads=0 tvp=zero
daemon: epoll: listen=10 active_threads=0 tvp=zero
daemon: epoll: listen=11 active_threads=0 tvp=zero
daemon: epoll: listen=12 busy
daemon: epoll: listen=13 active_threads=0 tvp=zero
daemon: epoll: listen=9 active_threads=0 tvp=zero
daemon: epoll: listen=10 active_threads=0 tvp=zero
daemon: epoll: listen=11 active_threads=0 tvp=zero
daemon: epoll: listen=12 busy
daemon: epoll: listen=13 active_threads=0 tvp=zero
[..endless repeating..]
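To check a longer syslog capture for the same pattern, the repeated "busy" reports can be counted per listener. The following is only a rough sketch: it assumes the message format shown in the excerpt above, and the function names and the threshold of three repeats are hypothetical choices made for illustration, not anything slapd provides.

```python
import re
from collections import Counter

# Matches messages such as "daemon: epoll: listen=12 busy", as seen in the
# log excerpt above. The exact format is assumed from that excerpt only.
BUSY_RE = re.compile(r"daemon: epoll: listen=(\d+) busy")


def count_busy_listeners(log_text):
    """Return a Counter mapping listener descriptor -> number of 'busy' reports."""
    return Counter(int(fd) for fd in BUSY_RE.findall(log_text))


def stuck_listeners(log_text, threshold=3):
    """Return descriptors reported busy at least `threshold` times.

    The threshold is an arbitrary heuristic (assumption), not a slapd setting.
    """
    counts = count_busy_listeners(log_text)
    return sorted(fd for fd, n in counts.items() if n >= threshold)


# One iteration of the repeating block from the excerpt, repeated four times.
one_round = (
    "daemon: epoll: listen=9 active_threads=0 tvp=zero\n"
    "daemon: epoll: listen=10 active_threads=0 tvp=zero\n"
    "daemon: epoll: listen=11 active_threads=0 tvp=zero\n"
    "daemon: epoll: listen=12 busy\n"
    "daemon: epoll: listen=13 active_threads=0 tvp=zero\n"
)
sample_log = one_round * 4

print(stuck_listeners(sample_log))
```

Because the pattern is matched anywhere in the text, the same function works whether the messages arrive one per line or run together on a single line.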
--- Comment #3 from Michael Ströder michael@stroeder.com ---
(In reply to Howard Chu from comment #1)
> If you suspect that TLS is the cause, then it should be simple to verify by trying to reproduce the issue with TLS disabled.
In Æ-DIR nothing works without TLS. Unencrypted connections are blocked. Also slapd uses the server cert as client cert for replication, thus there's a larger TLS ServerHello.
At least when simply disabling TLS for syncrepl, slapd hits "Confidentiality required" but no longer goes into a loop.
Quanah Gibson-Mount quanah@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |replication
Quanah Gibson-Mount quanah@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |2.5.0
Michael Ströder michael@stroeder.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #4 from Michael Ströder michael@stroeder.com ---
Not a problem with recent 2.4 releases anymore. Seems to be fixed.
Quanah Gibson-Mount quanah@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|replication                 |
             Status|RESOLVED                    |VERIFIED
   Target Milestone|2.5.0                       |---
--- Comment #5 from Quanah Gibson-Mount quanah@openldap.org ---
Thanks! There have definitely been a number of fixes to replication in the last few releases.