I second that; in fact, I was going to suggest exactly this test.
If possible, you should run it somewhere other than the production
server, for example on a VM after restoring the database there.
That would allow you to carefully execute tests without interference
from other requests coming in.
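To make sure swap really is off while you run the swap test suggested
below, a small check like the following might help. It is only a sketch,
assuming a Linux machine with the usual /proc/meminfo and
/proc/sys/vm/swappiness files; the function names (parse_meminfo,
swap_is_disabled, report) are made up for illustration.

```python
from pathlib import Path


def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_is_disabled(meminfo_text):
    """True when the kernel reports no swap space at all (SwapTotal is 0)."""
    return parse_meminfo(meminfo_text).get("SwapTotal", 0) == 0


def report(proc_dir="/proc"):
    """Print the current swappiness and whether any swap is active (Linux only)."""
    proc = Path(proc_dir)
    swappiness = (proc / "sys/vm/swappiness").read_text().strip()
    meminfo = (proc / "meminfo").read_text()
    print("vm.swappiness =", swappiness)
    print("swap disabled:", swap_is_disabled(meminfo))
```

Calling report() once before and once after 'swapoff -a' should show
whether the change actually took effect on the test machine.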
If you're on Linux, strace might give you more detail about what is
happening at the system-call level, but it will generate a lot of
messages, so you should try it as a last resort.
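One way to cope with the volume is to write the trace to a file and
summarise it afterwards. The sketch below assumes the trace was captured
with something like 'strace -f -tt -T -o trace.log -p PID' (trace.log
and PID are placeholders) and simply counts how often each system call
appears; count_syscalls is a made-up helper, not part of strace. Note
that 'strace -c' can print a similar summary by itself.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a call, e.g. "read(" in
# '1234 12:00:00.000001 read(3, "abc", 4096) = 3 <0.000012>'.
# "<... read resumed>" continuation lines and "+++ exited +++" lines
# do not match, so an interrupted call is only counted once.
SYSCALL_RE = re.compile(r"(?:^|\s)([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Count system calls by name in an iterable of strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarise(path, top=10):
    """Print the most frequent syscalls found in a saved strace log."""
    with open(path, errors="replace") as handle:
        for name, count in count_syscalls(handle).most_common(top):
            print(f"{count:8d}  {name}")
```

Running summarise("trace.log") on the saved file should give a quick
idea of which calls dominate before you read the full trace.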
Last, there are applications that might help you understand what is
(this will require installing and configuring Prometheus first)
Hope this helps.
On 26/07/2022 05:56, Wilkinson, Hugo (IT Dept) wrote:
If you have the ability to do so and your kernel is v5+, experiment
out of hours with disabling system swap entirely (vm.swappiness=0 and
'swapoff -a'), then simulate a run of the requests you'd expect to
trigger the page errors and see if it stops happening.
Depending on what else is going on this might not be a
production-compatible strategy, but it will rule that out as a
potential flaw/bug in 2.5 and/or give you a workaround until it's fixed.