Full_Name: Leonid Yuriev
Version: 2.4.40
OS: RHEL7
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (31.130.36.33)
Currently there is a flaw that prevents using OpenLDAP + LMDB in projects with a high rate of updates (add/modify/delete). The root of the problem is that LMDB cannot reclaim freed pages in the presence of a "laggard reader", in other words while those pages are still referenced by an active read transaction.
It should be noted that when reclaiming is withheld under a high update rate, free pages are burned through very quickly. The fix for ITS#7904 significantly improves the situation but does not solve the problem completely.
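The reclamation rule described above can be illustrated with a small model. This is only a sketch, not LMDB code: the function name `reclaimable_pages`, the dictionary standing in for the free list, and the simplified rule "pages freed by transaction N are reusable only once the oldest reader's txnid is greater than N" are assumptions made for illustration.

```python
def reclaimable_pages(freed, oldest_reader_txnid):
    """Return the pages that may be reused given the oldest active reader.

    `freed` maps the txnid of the write transaction that freed some pages
    to the list of page numbers it freed (a hypothetical stand-in for the
    free list). Pages freed at or after the oldest reader's txnid are
    treated as still pinned by that reader's snapshot.
    """
    pages = []
    for txnid, page_list in sorted(freed.items()):
        if txnid >= oldest_reader_txnid:
            break  # everything from here on is pinned by the laggard
        pages.extend(page_list)
    return pages


# Three write transactions, each of which freed two pages.
freed = {10: [3, 4], 11: [5, 6], 12: [7, 8]}

# A laggard reader stuck at txnid 10 blocks all reclamation.
print(reclaimable_pages(freed, oldest_reader_txnid=10))  # → []

# Once the oldest reader has advanced to txnid 13, every page is reusable.
print(reclaimable_pages(freed, oldest_reader_txnid=13))  # → [3, 4, 5, 6, 7, 8]
```

In this model a single stuck reader keeps every later freed page out of circulation, so under a high update rate the map fills even though most of its pages are logically free.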
First, a seemingly innocuous command such as "mdb_stat -efff | less" can lead to MDB_MAP_FULL and paralyze updates, since the read transaction stays open while the pager holds up the output.
Second, ITS#7904 covers syncrepl only partially. Approximately half of the "long read" operations occur without sending data to the network, so in many cases MDB_MAP_FULL is still easy to hit. This leads to a chain of problems and in some cases makes replication impossible.
To solve these problems, I made two simple improvements.
1) OOMKiller feature: just a fuse, like the Linux kernel OOM killer.
In general, on MDB_MAP_FULL it sends SIGKILL to a laggard reader, but never to itself. On success it retries the reclaim and continues. Enabled by envflags oomkill.
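The OOMKiller behaviour can be sketched roughly as follows. This is a model of the decision logic only, not the patch itself: the `Reader` record, the `on_map_full` function, the `try_reclaim` callback and the injectable `kill` parameter are all hypothetical names. The sketch also assumes that the victim is the reader with the oldest txnid and that nothing is killed when that reader belongs to the calling process.

```python
import os
import signal
from dataclasses import dataclass


@dataclass
class Reader:
    pid: int    # process that owns the read transaction
    txnid: int  # snapshot the reader is pinned to


def on_map_full(readers, try_reclaim, self_pid=None, kill=os.kill):
    """React to a map-full condition by killing the laggard reader.

    Returns True if a reader was killed and the retried reclaim succeeded,
    False if there was nobody to kill or the laggard is this process.
    """
    if self_pid is None:
        self_pid = os.getpid()
    victim = min(readers, key=lambda r: r.txnid, default=None)
    if victim is None or victim.pid == self_pid:
        return False  # never send SIGKILL to ourselves
    kill(victim.pid, signal.SIGKILL)
    readers.remove(victim)  # its slot no longer pins old pages
    return try_reclaim()    # retry the reclaim and carry on


# Demonstration with a fake kill() so that no real process is signalled.
sent = []
table = [Reader(pid=100, txnid=5), Reader(pid=200, txnid=9)]
ok = on_map_full(table, try_reclaim=lambda: True, self_pid=200,
                 kill=lambda pid, sig: sent.append(pid))
print(ok, sent)  # → True [100]
```

Passing `kill` as a parameter is purely a convenience for trying the logic out safely; the default is the real `os.kill`.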
2) Dreamcatcher feature: it really has caught our nightmares with syncrepl & MDB_MAP_FULL and made them vanish ;)
Based on the ITS#7904 fix. In general, it renews the read txn when the lag behind the last txn is greater than a configured threshold and the percentage of pages allocated is greater than a configured value. Enabled by dreamcatcher lag percentage.
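The Dreamcatcher condition amounts to two threshold checks that must both hold. A minimal sketch follows, assuming the lag is measured in transaction ids and the fill level as a percentage of the map's pages. The function and parameter names are hypothetical. In LMDB terms the renewal itself would presumably be an mdb_txn_reset() followed by mdb_txn_renew(), which is not modelled here.

```python
def should_renew_read_txn(reader_txnid, last_txnid, used_pages, total_pages,
                          lag_threshold, percent_threshold):
    """Decide whether a long-running read txn should be renewed.

    Both conditions must hold: the reader lags behind the last committed
    txn by more than `lag_threshold`, and more than `percent_threshold`
    percent of the pages are allocated.
    """
    lag = last_txnid - reader_txnid
    used_percent = 100.0 * used_pages / total_pages
    return lag > lag_threshold and used_percent > percent_threshold


# Reader is 10 txns behind and the map is 80% allocated: renew.
print(should_renew_read_txn(90, 100, 800, 1000,
                            lag_threshold=5, percent_threshold=70))  # → True

# Same lag, but the map is only half full: leave the reader alone.
print(should_renew_read_txn(90, 100, 500, 1000,
                            lag_threshold=5, percent_threshold=70))  # → False
```

Requiring both conditions means long reads are left alone while the map has plenty of room, and are interrupted only when they actually threaten to cause MDB_MAP_FULL.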
Two patchsets will be attached soon.