Hello.
I have some non-OpenLDAP-related LMDB questions and suggestions. I hope this is the correct list for them. If not, please point to a better venue.
It seems that MDB_cmp_func is a plain binary function without a context parameter. This makes it impossible to tune a comparison operation at runtime (without libffi magic, anyway). In particular, it makes it difficult to create bindings for the set_compare operation in other languages.
Funnily enough MDB_rel_func _does_ have a context pointer, even though it is not used yet. So would it be feasible to add a similarly contextful version of the comparison function? (Arguably these two could share the context pointer, it could just be mdb_dbi_set_userctx.)
Although the documentation in lmdb.h is generally good, it's a bit wishy-washy when it comes to lifetimes of data. The mdb_get documentation only says: "The memory pointed to by the returned values is owned by the database". My understanding is that the pointers to the key/data pair retrieved from the database are valid for the duration of the current transaction. After the transaction is over, the page in which the key/data pair was located may be reclaimed for other uses. If this is correct, could it be reflected in the documentation a bit more explicitly?
The current behavior is a bit inconsistent, though, in that depending on the operation, mdb_cursor_get may or may not update the `key` argument to point to the key in the database. It would be much clearer if after the call it _always_ pointed to the key in the database, even for operations (like MDB_SET_KEY) where the value of the key would be the same as the input argument.
Finally, what is the recommended way of dealing with composite keys? Suppose I want to look up a value associated to a pair of keys (say, a directory id and a filename). There are several ways of doing this:
* Encode the pair (key1, key2) into a single key, use that normally.
* Use MDB_DUPSORT, encode each (key2, value) as data for a duplicate key key1. Make sure the encoding is such that the data are sorted by key2. Then use MDB_GET_BOTH_RANGE to look up the nearest entry for a given (key1, key2) pair and check it actually has the correct key2.
* Create a named database for each key1, store its name as the data for key1. Store normal (key2, value) pairs in the named database,
Each of these has its problems. The first is somewhat wasteful, especially if, say, key1 is long and often repeated. The second is what I use, but it feels a bit kludgy: I'm supposed to use a key-value store yet I still have to manually decode and encode key-value pairs to and from their constituents. The third has lots of overhead if there are very few entries for a given key1, and it requires us to manually manage the lifetimes of the named databases.
My understanding is that MDB_DUPSORT internally does something close to the third option (but with optimizations to avoid wasting pages for just a few entries), but just using empty data items in the key-specific subdatabase. Would it then be feasible to add direct support for composite keys by allowing non-empty data there?
Thanks,
Lauri Alanko la@iki.fi