https://bugs.openldap.org/show_bug.cgi?id=10054
Issue ID: 10054
Summary: Value size limited to 2,147,479,552 bytes
Product: LMDB
Version: unspecified
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: louis@meilisearch.com
Target Milestone: ---
Hello,
According to the documentation[0], a database that is not using `MDB_DUPSORT` can store values up to `0xffffffff` bytes (around 4GB).
In practice, however, the actual limit under Linux is `0x7ffff000` bytes (2^31 - 4096, so around 2GB).
This is due to the write loop in `mdb_page_flush`. The `wsize` value determining how many bytes will be written can be as big as `4096*dp->mp_pages`[1], and the number of overflow pages grows with the size of the value put inside the DB.
`wsize` is not split into smaller chunks when there are many overflow pages to write. As a result, the call to `pwrite`[2] does not perform a full write, but only a "short" write of 2147479552 bytes (the maximum allowed in a single call to `pwrite` on Linux[3]).
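For reference, the 2147479552 figure matches the per-call transfer cap on Linux, which is `INT_MAX` rounded down to a page boundary. Below is a minimal sketch of that arithmetic, assuming 4096-byte pages; `linux_max_rw_count` is a hypothetical helper name used only for illustration, not something from LMDB or the kernel.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): the largest number of bytes a
 * single read/write call transfers on Linux, i.e. INT_MAX rounded down to
 * a multiple of the page size. `page_size` is assumed to be a power of two. */
static size_t linux_max_rw_count(size_t page_size)
{
    return (size_t)INT_MAX & ~(page_size - 1);
}
```

With `page_size` set to 4096 this gives `0x7ffff000`, so any `wsize` covering 524288 or more 4096-byte pages (2^31 bytes and up) cannot be written by one `pwrite` call.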
This would be OK if the short write condition were handled by looping and performing another `pwrite` with the rest of the data, but instead `EIO` is returned[4].
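To make the suggestion concrete, here is a minimal sketch of such a retry loop. It is not a patch against `mdb_page_flush`; `pwrite_all` is a hypothetical standalone helper, and how a real fix would integrate with the existing `pwritev` path and error handling in `mdb.c` is left open.

```c
#define _XOPEN_SOURCE 700 /* for pwrite() under strict standard modes */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of LMDB): write exactly `len` bytes from
 * `buf` at file offset `off`, issuing further pwrite() calls after short
 * writes. Returns 0 on success, or an errno value on failure. */
static int pwrite_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return errno;   /* genuine I/O error */
        }
        if (n == 0)
            return EIO;     /* no progress: report an error, do not spin */
        /* Short or full write: advance past the bytes that were written. */
        p   += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}
```

With a loop of this shape, a request larger than the per-call cap would be completed by a second `pwrite` for the remainder instead of being reported as `EIO`.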
There seems to be a related but different issue on macOS when trying to `pwrite` more than 2^31 bytes, which was already reported[5].
This issue was reported to me by a Meilisearch user because it causes their database indexing to fail[6]. I had to investigate a bit because their setup was peculiar (high number of documents in their database) and the `EIO` error code is not very descriptive of the underlying issue.
I have attached a C reproducer of the issue that attempts to add a 2147479553-byte value to the DB and fails with `EIO` (decreasing `nb_items` to a smaller value such as `2107479552` does succeed)[7].
Thank you for making LMDB! Louis Dureuil.
[0]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#LL284C...
[1]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#LL3770C...
[2]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L3820
[3]: https://stackoverflow.com/questions/70368651/why-cant-linux-write-more-than-...
[4]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L3840
[5]: https://bugs.openldap.org/show_bug.cgi?id=9736
[6]: https://github.com/meilisearch/meilisearch/issues/3654
[7]: https://github.com/dureuill/lmdb_3654/tree/main