I am trying to use LMDB to store very large amounts of binary data which, to limit the memory footprint, are split into chunks. Each chunk is stored under a separate key made up of [collectionId, chunkId], so that I can later iterate over the chunks using an LMDB cursor. The chunk size is configurable.
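To make the key scheme concrete, here is a minimal sketch of one way such a composite key could be laid out. The field widths, the big-endian encoding, and the names `make_key` and `split_key` are assumptions for illustration only. The idea is that fixed-width big-endian integers sort byte-wise in numeric order, so a cursor positioned at the first chunk of a collection would visit its chunks in sequence under LMDB's default lexicographic key comparison.

```python
import struct

# Hypothetical layout: 8-byte collection id followed by a 4-byte chunk id,
# both unsigned big-endian, so byte-wise key order equals numeric order.
KEY_FORMAT = ">QI"


def make_key(collection_id: int, chunk_id: int) -> bytes:
    """Build the composite key for one chunk."""
    return struct.pack(KEY_FORMAT, collection_id, chunk_id)


def split_key(key: bytes) -> tuple:
    """Recover (collection_id, chunk_id) from a composite key."""
    return struct.unpack(KEY_FORMAT, key)


# Keys created out of order still sort by collection, then by chunk.
keys = [make_key(7, c) for c in (300, 2, 1, 256)] + [make_key(8, 0)]
ordered = [split_key(k) for k in sorted(keys)]
print(ordered)
```

A little-endian or variable-length encoding would not give this ordering, which is why the sketch fixes both the width and the byte order.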
During my tests I encountered a strange scenario: after inserting some 2000 chunks of 512 KB each, the database size had grown to roughly 135 times the calculated size of the data. I ran stat over the db and saw more than 12000 overflow pages vs. approx. 2000 data pages. When I reduced the chunk size to 4060 bytes, the number of overflow pages went down to 1000 and the database size went down to the expected number (I experimented with different sizes; this gave the best result). I did not find any documentation that explains this behavior or how to deal with it. Of course, it makes me worry about database bloat and its consequences. Can anyone shed light on this?
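For reference, this is a rough sketch of how the expected size can be estimated. It assumes a 4096-byte page and a 16-byte page header at the start of each run of overflow pages; neither value is confirmed here, and the real page size should be read from the environment statistics. The names `overflow_pages` and `estimate` are made up for this example, and B-tree branch and leaf pages are ignored.

```python
PAGE_SIZE = 4096   # assumed; check the page size reported by the stat output
PAGE_HEADER = 16   # assumed header bytes at the start of an overflow run


def overflow_pages(value_size, page_size=PAGE_SIZE, header=PAGE_HEADER):
    """Pages needed for one large value: ceil((header + size) / page_size)."""
    return (header + value_size + page_size - 1) // page_size


def estimate(n_chunks, chunk_size, page_size=PAGE_SIZE):
    """Return (raw_bytes, total_overflow_pages, stored_bytes) for the chunks."""
    pages = n_chunks * overflow_pages(chunk_size, page_size)
    return n_chunks * chunk_size, pages, pages * page_size


for size in (512 * 1024, 4060):
    raw, pages, stored = estimate(2000, size)
    print(f"chunk={size} raw={raw} pages={pages} "
          f"stored={stored} ratio={stored / raw:.4f}")
```

Under these assumptions a 512 KB chunk should cost only about one page more than its payload, so the estimate comes nowhere near the factor I observed.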