I forgot to mention: I am using MDB_RESERVE to avoid an extra memcpy. Could this be the cause of the extreme database bloat?
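For reference, the pattern looks roughly like this (a minimal sketch with a hypothetical helper name; error handling trimmed):

/* mdb_put() with MDB_RESERVE does not copy the caller's buffer.
 * It reserves space of the requested size in the map and returns a
 * pointer to it in data.mv_data, which the caller fills in before
 * the next update operation or the end of the transaction. */
#include <string.h>
#include "lmdb.h"

int put_chunk_reserved(MDB_txn *txn, MDB_dbi dbi,
                       MDB_val *key, const void *chunk, size_t len)
{
    MDB_val data;
    data.mv_size = len;   /* reserve exactly the chunk size */
    data.mv_data = NULL;  /* filled in by LMDB on success */

    int rc = mdb_put(txn, dbi, key, &data, MDB_RESERVE);
    if (rc == MDB_SUCCESS)
        memcpy(data.mv_data, chunk, len);  /* single copy into the map */
    return rc;
}

This way the chunk is written into the database exactly once, instead of being staged in a buffer and then copied again by mdb_put().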
Hello,
is something wrong with the question below?
I am trying to use LMDB to store very large amounts of binary data which, to limit memory footprint, are split into chunks. Each chunk is stored under a separate key made up of [collectionId, chunkId], so that I can later iterate the chunks using an LMDB cursor. The chunk size is configurable.
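For illustration, the key encoding and cursor iteration look roughly like this (a sketch with assumed integer widths; my actual code differs in the details). Big-endian encoding keeps all chunks of one collection adjacent and in chunkId order under LMDB's default byte-wise key comparison:

#include <stdint.h>
#include <string.h>
#include "lmdb.h"

#define COLL_BYTES  8   /* collectionId as big-endian uint64 */
#define CHUNK_BYTES 4   /* chunkId as big-endian uint32 */

static void encode_key(uint8_t buf[COLL_BYTES + CHUNK_BYTES],
                       uint64_t coll, uint32_t chunk)
{
    for (int i = 0; i < COLL_BYTES; i++)
        buf[i] = (uint8_t)(coll >> (8 * (COLL_BYTES - 1 - i)));
    for (int i = 0; i < CHUNK_BYTES; i++)
        buf[COLL_BYTES + i] = (uint8_t)(chunk >> (8 * (CHUNK_BYTES - 1 - i)));
}

/* Visit every chunk of one collection, in chunkId order. */
static int iterate_chunks(MDB_txn *txn, MDB_dbi dbi, uint64_t coll)
{
    uint8_t start[COLL_BYTES + CHUNK_BYTES];
    encode_key(start, coll, 0);

    MDB_val key = { sizeof start, start };
    MDB_val data;
    MDB_cursor *cur;
    int rc = mdb_cursor_open(txn, dbi, &cur);
    if (rc != MDB_SUCCESS)
        return rc;

    /* MDB_SET_RANGE positions at the first key >= [coll, 0] */
    for (rc = mdb_cursor_get(cur, &key, &data, MDB_SET_RANGE);
         rc == MDB_SUCCESS;
         rc = mdb_cursor_get(cur, &key, &data, MDB_NEXT)) {
        if (key.mv_size < COLL_BYTES ||
            memcmp(key.mv_data, start, COLL_BYTES) != 0)
            break;  /* left this collection's key prefix */
        /* data.mv_data / data.mv_size is one chunk */
    }
    mdb_cursor_close(cur);
    return rc == MDB_NOTFOUND ? MDB_SUCCESS : rc;
}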
During my tests I ran into a strange scenario: after inserting some 2000 chunks of 512KB each, the database had grown to roughly 135 times the calculated size of the data. I ran the mdb_stat utility over the db and saw more than 12000 overflow pages vs. approx. 2000 data pages. When I reduced the chunk size to 4060 bytes, the number of overflow pages went down to 1000, and the database size came down to the expected value (I experimented with different sizes; this was the best result).

I did not find any documentation explaining this behaviour or how to deal with it. Of course it makes me worry about database bloat and its consequences. Can anyone shed light on this?
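In case it helps with reproducing the numbers: the same figures the stat utility prints can also be read programmatically via mdb_stat() (a minimal sketch; error handling trimmed):

#include <stdio.h>
#include "lmdb.h"

static void print_db_stats(MDB_txn *txn, MDB_dbi dbi)
{
    MDB_stat st;
    if (mdb_stat(txn, dbi, &st) == MDB_SUCCESS) {
        printf("page size:      %u\n",  st.ms_psize);
        printf("entries:        %zu\n", st.ms_entries);
        printf("leaf pages:     %zu\n", st.ms_leaf_pages);
        printf("overflow pages: %zu\n", st.ms_overflow_pages);
        /* values larger than roughly a page are stored on runs of
           contiguous overflow pages, so this ratio hints at how many
           pages a single chunk occupies */
        if (st.ms_entries)
            printf("overflow/entry: %.1f\n",
                   (double)st.ms_overflow_pages / (double)st.ms_entries);
    }
}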
thanks, Christian
Christian Sell
My bad. Nothing wrong with LMDB.
Christian Sell
GS Vitec GmbH, Im Ziegelhaus 6-8, D-63571 Gelnhausen
mail: christian@gsvitec.com, mobile: +49 (0) 173 5384289
Tel: +49 (0) 6051 601.26-90, Fax: +49 (0) 6051 601.26-91