Submission from: (NULL) (184.108.40.206)
We built a database with LMDB and are very pleased with it in general. The one
area giving our users headaches regularly are bulk deletes. But some context
Like many others, we use a hierarchical key notation. Some simplified examples:
"user/00001", "user/00002", "user/00003", ...;
Or, let's say data by timestamp: "sensordata/32489732",
Use case like "remove all users" or "remove all sensordata entries from
to y" are usual. Those (transactional) operations call mdb_cursor_del() on each
entry. This runs fine until we have LOTS of entries to delete. At a magnitude of
100,000s of entries, two things happen: 1) the time it takes to delete gets
quite noticeable and 2) the 32 bit (MDB_VL32) will run into MDB_TXN_FULL (see
Given the key structure, those operations operate on consecutive keys. Because
of that, can we make this more efficient (and get rid MDB_TXN_FULL on the fly)?
A high number of individual deletes involves lots of page merging (and a lot of
dirty pages). If we had a specialized "range delete" this could likely be
avoided. The signature of the proposed functionality could look like this:
The number of pages dirtied will be the same, whether you batch or delete one by one.
As such, this won't solve the MDB_TXN_FULL problem. Also, as of commit
this problem should have already been removed for MDB_VL32.
int mdb_cursor_del_range(MDB_cursor *mc, MDB_val *key_start, MDB_val
unsigned int flags)
Given key_start and key_stop, this new function could decide to prune entire
leaf pages, or even entire sub branches. This should result in a significant
efficiency/performance gain. The challenge I'm seeing here is balancing the tree
Right, rebalancing would become quite a pain. A similar problem exists for
PUT_MULTIPLE which is why it just internally loops over one item at a time.
What do you think? I'm not sure if OpenLdap would profit here,
but I'm quite
certain that LMDB as a general database would. I'm also open to contribute if
you are willing to give some guidance.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/