On Thu, Aug 24, 2017 at 19:30:17 -0500, Quanah Gibson-Mount wrote:
When a write operation is performed with LMDB, the freelist is scanned for available space to reuse if possible. The larger the freelist, the longer the operation takes to complete. Once the database reaches a certain level of fragmentation (this differs for each use case), write operations start taking a noticeable amount of time, and the server processing the write essentially comes to a halt until it finishes. Once the write operation completes, things go back to normal. The only solution is to dump and reload the database (slapcat/slapadd or mdb_copy -c). Eventually, you will get back into the same situation and have to do this again.
[..]
This is one area where LMDB differs significantly from back-hdb/bdb. Back-bdb/hdb databases could endure a high rate of write operations for years w/o needing maintenance. With LMDB, you get better read & write rates, but it requires periodic reloads.
Thanks Quanah, this definitely explains the issues we saw.
So we'll have to live with periodic mdb maintenance. I think with mdb_copy -c it should be quite doable, as opposed to slapcat/slapadd which took all day.
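For illustration, a minimal sketch of how the mdb_copy -c step could be wrapped in a script. The function names and any paths passed in are hypothetical. It assumes the destination directory must exist and be empty before the copy, and it leaves out stopping slapd and swapping the compacted copy into place.

```python
import os
import subprocess


def compact_command(src_dir, dst_dir):
    """Build the mdb_copy invocation; -c requests a compacting copy."""
    return ["mdb_copy", "-c", src_dir, dst_dir]


def compact(src_dir, dst_dir):
    """Write a compacted copy of the environment in src_dir to dst_dir.

    Assumption: dst_dir should be an existing, empty directory, so we
    create it if needed and refuse to continue if it has any content.
    """
    os.makedirs(dst_dir, exist_ok=True)
    if os.listdir(dst_dir):
        raise RuntimeError("destination directory is not empty: " + dst_dir)
    # check=True raises CalledProcessError if mdb_copy exits non-zero.
    subprocess.run(compact_command(src_dir, dst_dir), check=True)
```

Calling compact() with the live database directory and a scratch directory would produce the compacted copy. Replacing the live files with it is a separate step that depends on the deployment.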
I'll have to look for some freelist size threshold on which we can set an alert, before we get into noticeable trouble again.
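For illustration, a minimal sketch of such a check. It assumes that mdb_stat -f prints a "Free pages: N" line in its freelist section; the function names and the threshold are hypothetical, and a sensible threshold would have to be found by observation for each workload.

```python
import re
import subprocess

# Assumption: the freelist section of `mdb_stat -f` output contains a
# line of the form "  Free pages: 1234".
FREE_PAGES_RE = re.compile(r"^\s*Free pages:\s*(\d+)\s*$", re.MULTILINE)


def parse_free_pages(stat_output):
    """Extract the freelist page count from mdb_stat -f output."""
    match = FREE_PAGES_RE.search(stat_output)
    if match is None:
        raise ValueError("no 'Free pages' line found in mdb_stat output")
    return int(match.group(1))


def freelist_exceeds(free_pages, threshold):
    """Return True when the freelist has reached the alert threshold."""
    return free_pages >= threshold


def check_freelist(db_dir, threshold):
    """Run mdb_stat -f on db_dir and compare the result to threshold.

    threshold is a hypothetical page count chosen by the operator.
    """
    result = subprocess.run(
        ["mdb_stat", "-f", db_dir],
        check=True,
        capture_output=True,
        text=True,
    )
    return freelist_exceeds(parse_free_pages(result.stdout), threshold)
```

A monitoring job could call check_freelist() on the database directory at intervals and raise an alert when it returns True.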
Can the need for this periodic mdb maintenance be documented in the OpenLDAP admin guide?
I'll respond to the Zimbra specific remarks off-list.
Geert