https://bugs.openldap.org/show_bug.cgi?id=10027
Issue ID: 10027
Summary: MDB_TXN_FULL on large write transactions
Product: LMDB
Version: unspecified
Hardware: All
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: renault.cle@gmail.com
Target Milestone: ---
Hello,
Our users ([1], [2]) encountered MDB_TXN_FULL errors when our Meilisearch engine processed a large write transaction. We read the documentation for this error in the LMDB codebase:
Spill pages from the dirty list back to disk. This is intended to prevent running into #MDB_TXN_FULL situations, but note that they may still occur in a few cases:
1) our estimate of the txn size could be too small. Currently this seems unlikely, except with a large number of #MDB_MULTIPLE items.
2) child txns may run out of space if their parents dirtied a lot of pages and never spilled them. TODO: we probably should do a preemptive spill during #mdb_txn_begin() of a child txn, if the parent's dirty_room is below a given threshold.
Otherwise, if not using nested txns, it is expected that apps will not run into #MDB_TXN_FULL any more. The pages are flushed to disk the same way as for a txn commit, e.g. their P_DIRTY flag is cleared. If the txn never references them again, they can be left alone. If the txn only reads them, they can be used without any fuss. If the txn writes them again, they can be dirtied immediately without going thru all of the work of #mdb_page_touch(). Such references are handled by #mdb_page_unspill().
However, it looks like we are in neither of those scenarios: we are not using MDB_DUPFIXED (which MDB_MULTIPLE requires), and we are not using sub-transactions. We don't use the MDB_VL32 flag either, so this is not related to [3].
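To make our reading of the quoted comment concrete, here is a toy model in C of a dirty list with a fixed budget. It is only a sketch and not LMDB code: every name (toy_txn, toy_spill, toy_write_page, ...) and every size is made up for illustration, and the real spill logic in liblmdb is more selective about which pages it flushes. The point it shows is that, with spilling enabled, a single transaction without nested txns can dirty far more pages than the dirty list holds without ever reporting a full transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: all names and sizes are hypothetical, not LMDB internals. */
enum { TOY_OK = 0, TOY_TXN_FULL = -1 };
enum { TOY_NUM_PAGES = 64, TOY_DIRTY_MAX = 8 };

typedef enum { PAGE_CLEAN = 0, PAGE_DIRTY, PAGE_SPILLED } toy_state;

typedef struct {
    toy_state state[TOY_NUM_PAGES];
    size_t dirty_room;   /* free slots left in the dirty list */
    size_t spill_count;  /* pages flushed before commit */
    bool allow_spill;    /* false models a txn that cannot spill */
} toy_txn;

static void toy_txn_begin(toy_txn *txn, bool allow_spill) {
    memset(txn, 0, sizeof *txn);          /* every page starts PAGE_CLEAN */
    txn->dirty_room = TOY_DIRTY_MAX;
    txn->allow_spill = allow_spill;
}

/* Flush every dirty page: clear its dirty state and free its slot. */
static void toy_spill(toy_txn *txn) {
    for (size_t i = 0; i < TOY_NUM_PAGES; i++) {
        if (txn->state[i] == PAGE_DIRTY) {
            txn->state[i] = PAGE_SPILLED;
            txn->dirty_room++;
            txn->spill_count++;
        }
    }
}

/* Dirty one page, spilling first if the dirty list has no room left. */
static int toy_write_page(toy_txn *txn, size_t pgno) {
    assert(pgno < TOY_NUM_PAGES);
    if (txn->state[pgno] == PAGE_DIRTY)
        return TOY_OK;                    /* already in the dirty list */
    if (txn->dirty_room == 0) {
        if (!txn->allow_spill)
            return TOY_TXN_FULL;
        toy_spill(txn);
    }
    txn->state[pgno] = PAGE_DIRTY;        /* clean or spilled page is dirtied */
    txn->dirty_room--;
    return TOY_OK;
}

/* Dirty pages 0..count-1 in order, stopping at the first error. */
static int toy_write_range(toy_txn *txn, size_t count) {
    for (size_t pgno = 0; pgno < count; pgno++) {
        int rc = toy_write_page(txn, pgno);
        if (rc != TOY_OK)
            return rc;
    }
    return TOY_OK;
}
```

In this model, toy_write_range(&txn, TOY_NUM_PAGES) succeeds when the transaction was started with allow_spill set to true, and returns TOY_TXN_FULL on the ninth page when it was not. The first outcome is what we expected from LMDB for our workload, which is why the error surprised us.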
Thank you for your time.
Have a nice day 💡
[1]: https://github.com/meilisearch/meilisearch/issues/3603
[2]: https://github.com/meilisearch/meilisearch/issues/3349
[3]: https://bugs.openldap.org/show_bug.cgi?id=8813