I am trying to use LMDB to store very large amounts of binary data which, to
limit memory footprint, are split into chunks. Each chunk is stored under a
separate key made up of [collectionId, chunkId], so that I can later iterate
over the chunks with an LMDB cursor. The chunk size is configurable.
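As a sketch of the key scheme described above (the two-field layout is mine; the original post does not specify field widths or byte order), packing both IDs big-endian keeps keys in numeric order under LMDB's default lexicographic byte comparison, so a cursor naturally walks one collection's chunks in sequence:

```python
import struct

def chunk_key(collection_id: int, chunk_id: int) -> bytes:
    # Hypothetical layout: two unsigned 64-bit fields, big-endian.
    # Big-endian packing makes the byte order of the key match the
    # numeric order of (collection_id, chunk_id), which is what an
    # LMDB cursor scan relies on.
    return struct.pack(">QQ", collection_id, chunk_id)

# Keys for consecutive chunks of one collection sort in chunk order.
keys = [chunk_key(7, i) for i in (0, 1, 2, 10)]
assert keys == sorted(keys)
```

With little-endian packing the byte-wise sort order would diverge from the numeric order once IDs exceed 255, which is why the byte order matters here.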
During my tests, I encountered a strange scenario: after inserting roughly
2000 chunks of 512 KB each, the database had grown to roughly 135 times the
calculated size of the data. Running stat over the db showed more than 12000
overflow pages versus approximately 2000 data pages.
When I reduced the chunk size to 4060 bytes, the number of overflow pages
dropped to about 1000, and the database size came down to the expected value
(I experimented with several sizes; this one gave the best result). I have not
found any documentation that explains this behavior or how to deal with it.
Naturally, it makes me worry about database bloat and its consequences. Can
anyone shed light on this?
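For context on the overflow-page counts mentioned above, here is a rough sketch of how LMDB sizes an overflow chain, modeled on the OVPAGES macro in mdb.c (the 4096-byte page size and 16-byte page header are assumptions about a default build; this is my reading of the source, not documented behavior):

```python
PAGE_SIZE = 4096   # assumption: LMDB default page size
PAGE_HDR = 16      # assumption: approximate page header size in mdb.c

def overflow_pages(value_size: int, page_size: int = PAGE_SIZE) -> int:
    """Sketch of LMDB's OVPAGES macro: the number of contiguous
    overflow pages needed for one value too large to fit inline
    in a leaf page."""
    return (PAGE_HDR - 1 + value_size) // page_size + 1

# A single 512 KB value occupies a chain of about 129 pages.
print(overflow_pages(512 * 1024))  # 129
```

Note that a value only slightly larger than what fits inline still claims a whole page, so small chunks that just miss the inline threshold waste proportionally more space than large ones.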