Hi,
we have a quite strange behaviour in which a slapd server stops processing connections for tens of seconds while a single thread runs at 100% on one CPU and all the other CPUs are almost idle. When the problem arises there is no significant iowait or disk I/O (and no swapping; that's disabled). Context switches drop to nearly zero (from tens of thousands to a few hundred). Load average is almost always under 2.
The server has 32G of RAM and 4 HT processors, and is running openldap-2.4.54 in mirror mode (but no delta replication) using the mdb backend. The same behaviour was also seen with 2.4.53. OpenLDAP is the only service running on it, apart from SSH and some monitoring tools. The database maxsize is 25G, of which around 17G are used.
I'm attaching a redacted configuration of the main server (the secondary one is the same, with the IDs reversed for mirror mode use).
Most of the time it works just fine, processing up to a few thousand read queries per second while handling some tens of writes per second. Connections are managed by HAProxy, which sends them to this server by default (used as the main node). Many of these stops are short (around 10 seconds) and we don't lose connections, but when the problem arises and lasts long enough, HAProxy switches to the second node and we get downtime. Staying on the secondary node we see the same behaviour.
The problem manifests itself without periodicity, and looking at the number of connections before it we could not see any usage peak. We tried to strace the slapd threads during the problem, and they seem blocked on a mutex, waiting for the one running at 100% (on a single CPU, in user time). I'm attaching top output from one of these events.
From the behaviour I was suspecting (just a wild and uninformed guess) some indexing issue blocking all access.
We tried changing tool-threads to 4 because I found it cited in some examples as related to the threads used for indexing, but the change had no effect. Re-reading the latest version of the man page, if I understand it correctly, it's effective only for slapadd and the like.
So a first question is: is there any other configuration parameter about indexing that I could try?
Anyway, I'm not sure there really is an indexing issue (the indexes are quite basic). I was suspecting this because there are lots of writes, and there is no strace activity during the stop. Should I look somewhere else?
Any suggestion on further checks or configuration changes would be more than appreciated.
Regards,
Simone
If I were facing this symptom, I'd capture a couple of pstack <slapd PID> outputs while the problem is occurring (and maybe correlate them with perf top -p <slapd PID> if the pstacks are not enough). That should help avoid guessing.
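A minimal capture loop along these lines (a sketch; the function name, snapshot count, and output directory are illustrative, not anything prescribed in the thread) can be kept ready to run during a stall:

```shell
# capture_stacks PID [N] [DIR]: take N pstack snapshots of PID, 2 s apart,
# so the thread that stays busy across snapshots stands out.
capture_stacks() {
    pid=$1; n=${2:-3}; dir=${3:-/tmp/slapd-stacks}
    mkdir -p "$dir"
    i=0
    while [ "$i" -lt "$n" ]; do
        i=$((i + 1))
        # pstack prints one backtrace per thread of the target process
        pstack "$pid" > "$dir/pstack.$i.txt" || echo "pstack $pid failed" >&2
        [ "$i" -lt "$n" ] && sleep 2
    done
    return 0
}
# e.g. while the stall is happening:
#   capture_stacks "$(pidof slapd)"
#   perf top -p "$(pidof slapd)"   # in another terminal, for the hot function
```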
++Cyrille
-----Original Message-----
From: Simone Piccardi [mailto:piccardi@truelite.it]
Sent: Tuesday, November 3, 2020 6:41 PM
To: openldap-technical@openldap.org
Subject: Connections blocked for some tens of seconds while a single slapd thread running 100%
--On Tuesday, November 3, 2020 6:41 PM +0100 Simone Piccardi piccardi@truelite.it wrote:
The problem manifests itself without periodicity, and looking at the number of connections before it we could not see any usage peak. We tried to strace the slapd threads during the problem, and they seem blocked on a mutex, waiting for the one running at 100% (on a single CPU, in user time). I'm attaching top output from one of these events.
If you can attach to the process while this is occurring, I'd suggest obtaining a full GDB backtrace to see what the different slapd threads are doing at that time. Also, what mutex specifically is slapd waiting on?
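A non-interactive way to grab such a backtrace (a sketch; the helper name and output path are arbitrary, and it yields readable frames only if the binary and libraries carry symbols):

```shell
# full_bt PID [OUT]: dump a full backtrace of every thread of PID using gdb
# in batch mode, then detach; the process is only paused briefly.
full_bt() {
    pid=$1; out=${2:-/tmp/slapd-bt.txt}
    gdb -p "$pid" -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt full" > "$out" 2>&1
}
# e.g.: full_bt "$(pidof slapd)"
```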
From the behaviour I was suspecting (just a wild and uninformed guess) some indexing issue blocking all access.
We tried changing tool-threads to 4 because I found it cited in some examples as related to the threads used for indexing, but the change had no effect. Re-reading the latest version of the man page, if I understand it correctly, it's effective only for slapadd and the like.
Correct, this setting has nothing to do with a running slapd process. It only affects how many threads are used by slapadd and slapindex when indexing during offline operations. Additionally, any value above 2 has no effect with back-mdb (it'll just be set back to 2).
So a first question is: is there any other configuration parameter about indexing that I could try?
If you really believe that this is indexing related, you should be able to tell from the slapd logs at "stats" logging, where you would see a specific search taking a significant amount of time. However, that generally does not lead to a paused system, as searches shouldn't trigger a mutex issue like the one you're describing.
Is this on RHEL7 or later? If you have both "stats" and "sync" logging enabled (the recommended setting for replicating nodes), what does the slapd log show is happening at this time?
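For reference, in slapd.conf that combination is a single directive (under cn=config the equivalent attribute is olcLogLevel):

```
# log both operation statistics and syncrepl activity
loglevel stats sync
```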
Regards, Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP: http://www.symas.com
On 03/11/20 22:49, Quanah Gibson-Mount wrote:
The problem manifests itself without periodicity, and looking at the number of connections before it we could not see any usage peak. We tried to strace the slapd threads during the problem, and they seem blocked on a mutex, waiting for the one running at 100% (on a single CPU, in user time). I'm attaching top output from one of these events.
If you can attach to the process while this is occurring, I'd suggest obtaining a full GDB backtrace to see what the different slapd threads are doing at that time. Also, what mutex specifically is slapd waiting on?
I executed gstack on the slapd pid during one of these events and saved the output; the files are attached, but the running slapd is stripped, so they are quite obscure (at least to me).
We are trying to put a non-stripped version (compiled with CFLAGS='-g' and --enable-debug=yes) into use for a test, but that's a production machine, and it will take a while.
What should I do to find out which mutex it is? In the straces they are identified just by a number.
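(Sketching an answer here: the number strace prints for futex() is the address of the lock word, i.e. of the mutex itself. If slapd is traced with strace -ff -o so each thread gets its own log file, counting those addresses shows whether all threads are queued on one lock. The helper below is illustrative; the file-name prefix is our choice.)

```shell
# top_futexes DIR: count which FUTEX_WAIT addresses the threads block on.
# DIR holds per-thread logs from: strace -ff -tt -o DIR/slapd.trace -p <PID>
top_futexes() {
    grep -h -o 'futex(0x[0-9a-f]*, FUTEX_WAIT[A-Z_]*' "$1"/slapd.trace.* \
        | sort | uniq -c | sort -rn
}
# With an unstripped slapd, gdb can then try to name the lock:
#   gdb -p <PID> -batch -ex 'info symbol 0x7f29eae6e040'
```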
So a first question is: is there any other configuration parameter about indexing that I could try?
If you really believe that this is indexing related, you should be able to tell from the slapd logs at "stats" logging, where you would see a specific search taking a significant amount of time. However, that generally does not lead to a paused system, as searches shouldn't trigger a mutex issue like the one you're describing.
No, it's not that I believe that; as I said, it was just a guess about something that could need full CPU for tens of seconds while blocking all other operations. But from what you are saying the guess is probably plain wrong.
Is this on RHEL7 or later? If you have both "stats" and "sync" logging enabled (the recommended setting for replicating nodes), what does the slapd log show is happening at this time?
The server is running an updated version of Amazon Linux (Amazon Linux AMI 2018.03).
We enabled stats and sync logging, and I'm attaching a redacted excerpt of the logs around the incident time, when I also took the gstack.txt (done at 00:39:04) and gstack2.txt (done at 00:39:20) backtraces. But during that time there is no data.
Simone
Simone Piccardi wrote:
I executed gstack on the slapd pid during one of these events and saved the output; the files are attached, but the running slapd is stripped, so they are quite obscure (at least to me).
Traces from a stripped binary are useless.
On 05/11/20 16:26, Howard Chu wrote:
Traces from a stripped binary are useless.
Using a non-stripped binary in production led to some performance problems, so it took a while to plan a deployment long enough to gather data.
I'm attaching two gstack traces taken during two different events. What I see is one thread (the second one; here the slapd pid was 31267) in epoll_wait, and all the others waiting on a futex except one, 31280. But I'm not skilled enough to understand what it is doing.
We also had a much longer event (it lasted some minutes), later than the ones the attached traces were taken from. During this one all the threads were straced, and we don't have a gstack trace.
This time we also got some different errors in the log, and I'm attaching a redacted excerpt in the hope that they may constitute a useful clue. What we found were messages like:
Nov 10 00:49:56 ldp-11 slapd[31267]: nonpresent_callback: rid=012 present UUID 258f12bd-b531-426e-8dc8-49263545db58, dn cn=905719,cn=protected,o=ourorg
They started to appear around five seconds before most of the slapd threads stopped, waiting on a futex (that happened near 00:50:03). After that there were still lots of messages in the logs (but only "nonpresent_callback" ones) up to around 00:50:57; then nothing more until activity resumed (around 00:53:06).
From the strace we saw that the second thread (MAIN_PID+1, here 31268) kept processing epoll_wait, with some activity at the beginning, then just waking every 2500 ms and doing nothing. Another thread was sending (with sendto) a lot of messages to fd 3 (from their first bytes they seem to be syslog messages) for about 50 seconds (and the log is full of "nonpresent_callback" during this time); then it also stopped on the same futex as the other threads.
The only thread (except for the epoll waiter) that never stopped was making just the following system calls:
00:50:05.384738 mprotect(0x7f23bb1bc000, 9613312, PROT_READ|PROT_WRITE) = 0
00:53:05.415500 msync(0x7f23e6e56000, 25769803776, MS_SYNC) = 0
00:53:06.114513 futex(0x7f29eae6e040, FUTEX_WAKE, 1) = 1
The FUTEX_WAKE was on the same futex blocking all the other threads, and after it they resumed working.
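That three-minute silence can also be pulled out of a trace mechanically. A small awk helper (a sketch of ours, assuming strace was run with -tt so lines start with HH:MM:SS.usec timestamps) flags any line after which the thread went quiet:

```shell
# long_gaps FILE SECS: print trace lines followed by a gap longer than SECS
# seconds before the next line (sub-second fractions are ignored).
long_gaps() {
    awk -v min="$2" '{
        split($1, t, /[:.]/)
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1 && now - prev > min)
            print line " <-- " (now - prev) "s until the next syscall"
        prev = now; line = $0
    }' "$1"
}
# e.g.: long_gaps slapd.trace.31280 60
```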
I hope this could be enough to pin down the problem.
Simone
Your gstacks are mostly useless, because you probably launched them from a directory that did not allow the library paths to be resolved; therefore, instead of seeing function names, we see "??".
++Cyrille
-----Original Message-----
From: Simone Piccardi [mailto:piccardi@truelite.it]
Sent: Thursday, November 5, 2020 4:17 PM
To: openldap-technical@openldap.org
Subject: Re: Connections blocked for some tens of seconds while a single slapd thread running 100%
Simone Piccardi piccardi@truelite.it wrote on 05.11.2020 at 16:17 in message 5a6d778a-b75b-3027-3a88-f5507c83977b@truelite.it:
I executed gstack on the slapd pid during one of these events and saved the output; the files are attached, but the running slapd is stripped, so they are quite obscure (at least to me).
I think even when stripped, you could "re-attach" the symbols (given that you saved them before stripping). For some distributions, such symbol (debug) packages are available for installation. I don't know about your package source, however.