https://bugs.openldap.org/show_bug.cgi?id=10027
Issue ID: 10027
Summary: MDB_TXN_FULL on large write transactions
Product: LMDB
Version: unspecified
Hardware: All
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: renault.cle@gmail.com
Target Milestone: ---
Hello,
Our users ([1], [2]) encountered MDB_TXN_FULL errors when our Meilisearch engine processed a large write transaction. We read the documentation about this error in the LMDB codebase:
Spill pages from the dirty list back to disk. This is intended to prevent running into #MDB_TXN_FULL situations, but note that they may still occur in a few cases:
1) our estimate of the txn size could be too small. Currently this seems unlikely, except with a large number of #MDB_MULTIPLE items.
2) child txns may run out of space if their parents dirtied a lot of pages and never spilled them. TODO: we probably should do a preemptive spill during #mdb_txn_begin() of a child txn, if the parent's dirty_room is below a given threshold.
Otherwise, if not using nested txns, it is expected that apps will not run into #MDB_TXN_FULL any more. The pages are flushed to disk the same way as for a txn commit, e.g. their P_DIRTY flag is cleared. If the txn never references them again, they can be left alone. If the txn only reads them, they can be used without any fuss. If the txn writes them again, they can be dirtied immediately without going thru all of the work of #mdb_page_touch(). Such references are handled by #mdb_page_unspill().
However, it looks like we are not in any of those scenarios: we are not using MDB_DUPFIXED (so no MDB_MULTIPLE items), and we are not using sub-transactions. We don't use the MDB_VL32 flag either, so this is not related to [3].
Thank you for your time. Have a nice day 💡
[1]: https://github.com/meilisearch/meilisearch/issues/3603 [2]: https://github.com/meilisearch/meilisearch/issues/3349 [3]: https://bugs.openldap.org/show_bug.cgi?id=8813
--- Comment #1 from Howard Chu hyc@openldap.org --- Sounds like you're simply batching too much into a single write transaction. There are constraints on how many dirty pages may be held in memory at once; above that limit even spilling won't help.
--- Comment #2 from renault.cle@gmail.com --- Thank you for the quick reply, Howard.
You are talking about a limit on the number of dirty pages. Do you know what the limit is? I looked at this part of the code [1], where `mt_dirty_room` is assigned `MDB_IDL_UM_MAX`. I tried to compute its value, and it seems that only 65535 pages can be dirty in a single transaction. Is that right?
And what do you mean by "in memory"? Are the dirty pages held in resident (RES) memory? That would mean LMDB can allocate up to ~256 MiB (65535 * 4 KiB pages), or ~1 GiB when pages are 16 KiB.
If everything I say above is correct:
- Can we increase that limit?
- Can we move the dirty pages to disk?
- Can we track this number?
- Can you tell me how much LMDB can allocate in RES? Is there a known limit?
[1]: https://github.com/LMDB/lmdb/blob/3947014aed7ffe39a79991fa7fb5b234da47ad1a/l...
--- Comment #3 from Howard Chu hyc@openldap.org --- It's not as simple as that. UM_MAX is 2^17 by the way, 131072.
When the in-memory dirty list is full, we will try to spill some dirty pages to disk to make room. That works as long as the working set of the transaction can be split up. If you have a single k/v pair larger than the dirty list, though, it can't be partially spilled.
You can try increasing MDB_IDL_LOGN. It impacts a lot of LMDB's internal structures though, so overall RAM footprint will increase a lot.
Quanah Gibson-Mount quanah@openldap.org changed:
What       |Removed      |Added
-----------+-------------+----------
Keywords   |needs_review |
Status     |UNCONFIRMED  |RESOLVED
Resolution |---          |WONTFIX
Quanah Gibson-Mount quanah@openldap.org changed:
What   |Removed  |Added
-------+---------+----------
Status |RESOLVED |VERIFIED
--- Comment #4 from kero renault.cle@gmail.com --- Thank you for the answer,
At Meilisearch, we try to control the memory we allocate. In one data structure that we use to sort key-value pairs before storing them in LMDB, we allocate one big buffer up front and never grow it, so we know the maximum amount of memory we will allocate during indexing.
One question remains: Can you tell me how much LMDB can allocate in RES? Is there a known limit? Can we control it?
I understand that it will depend on the nature of the entries we store, the length of the keys, and the data values. I also read about MDB_WRITEMAP [1]. It seems that it uses fewer mallocs, but is the difference significant? Hard to tell, I suppose.
[1]: http://www.lmdb.tech/doc/group__mdb.html#ga32a193c6bf4d7d5c5d579e71f22e9340