https://bugs.openldap.org/show_bug.cgi?id=9360
--- Comment #3 from Howard Chu <hyc@openldap.org> ---
(In reply to spam@markandruth.co.uk from comment #2)
> It's a Docker container running Alpine edge on a Google Cloud COS host with HDDs (all processes accessing the DB run from within the same container). From some further tests earlier in the day, it seems that only a subset of the Lua processes have the issue, or perhaps it is intermittent in some of the Lua processes; I haven't quite figured this out yet. mdb_stat -r doesn't show anything strange, just 5-10 readers. The DB is ~500 MB.
> I often see this when restarting the container after it has been running for several days; more frequent restarts don't seem to show the issue as much, which leads me to think it may be related to slow HDD cache access causing a race or something similar. But I don't fully understand why it would be a persistent issue rather than just the first few requests having a problem.
> I'm happy to try to debug this further, but I need a bit of guidance on what data would be most useful to collect.
Docker has been known to cause issues, particularly due to its use of overlay filesystems. If you use external persistent storage this problem will probably go away.
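One way to check whether the database directory actually sits on an overlay filesystem is to look at the mount table from inside the container. The following is only a rough sketch in Python, assuming the Linux /proc/mounts format; the function name fs_type_for and the example paths are hypothetical and not part of LMDB or Docker.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is expected to be in /proc/mounts format:
        device mount_point fs_type options dump pass
    Assumption: mount points contain no escaped characters (e.g. \\040).
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Prefer the longest mount point; on a tie, the later entry
            # wins because it shadows the earlier mount.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best
```

To use it, read /proc/mounts inside the container and pass its contents together with the absolute path of the database directory. If the reported type is overlay, the data lives in the container's writable layer; a bind mount or volume normally shows up with the host filesystem's type (for example ext4) instead.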