The documentation for the `mdb_cursor_del` function doesn't indicate what
happens to the state of the cursor after the targeted object is deleted.
This makes it difficult to reason about, for example, the task of scanning
through the database and deleting keys or values according to some criterion.
What does MDB_NEXT mean on a cursor after deletion? Will it return the next
element, or should I use MDB_GET_CURRENT for the next element? Or will
these operations return an error, and should I use MDB_SET_RANGE with the
last deleted key to reposition the cursor?
Similarly, what happens to every other cursor pointing to the same key? I
see you have some code that touches them, but no documentation to tell
users what is happening.
This needs to be documented. Iterator invalidation is painful in C++ where
it is thoroughly documented. When it's reduced to guesswork or testing
without a clear indication of intended behavior, it's worse.
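Until the behavior is documented, a conservative scan-and-delete pattern is to save the key before deleting and re-seek with MDB_SET_RANGE rather than relying on the post-delete cursor state. A sketch only, assuming a non-MDB_DUPSORT database and keys within the default 511-byte limit; should_delete() is a hypothetical user predicate:

```c
#include <string.h>
#include <lmdb.h>

/* Sketch: delete every entry matching a predicate without depending on
 * the (undocumented) cursor position after mdb_cursor_del.  After each
 * delete, copy the key and re-position with MDB_SET_RANGE, which lands
 * on the first key >= the deleted one, i.e. the next surviving entry. */
int delete_matching(MDB_txn *txn, MDB_dbi dbi,
                    int (*should_delete)(const MDB_val *, const MDB_val *))
{
    MDB_cursor *cur;
    MDB_val key, data;
    int rc = mdb_cursor_open(txn, dbi, &cur);
    if (rc)
        return rc;
    rc = mdb_cursor_get(cur, &key, &data, MDB_FIRST);
    while (rc == 0) {
        if (should_delete(&key, &data)) {
            char saved[511];            /* default MDB_MAXKEYSIZE */
            size_t klen = key.mv_size;
            memcpy(saved, key.mv_data, klen);
            rc = mdb_cursor_del(cur, 0);
            if (rc)
                break;
            key.mv_data = saved;
            key.mv_size = klen;
            rc = mdb_cursor_get(cur, &key, &data, MDB_SET_RANGE);
        } else {
            rc = mdb_cursor_get(cur, &key, &data, MDB_NEXT);
        }
    }
    mdb_cursor_close(cur);
    return rc == MDB_NOTFOUND ? 0 : rc;
}
```

The re-seek costs an extra tree descent per deletion, which is exactly why documented post-delete semantics would be valuable: if MDB_NEXT after mdb_cursor_del is guaranteed to yield the following entry, the copy and re-seek are unnecessary.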
Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> On 03/02/15 09:41, Howard Chu wrote:
>>> Emmanuel Lécharny wrote:
>>>> On 03/02/15 05:11, Howard Chu wrote:
>>>>> Another option here is simply to perform batching. Now that we have
>>>>> the TXN api exposed in the backend interface, we could just batch up
>>>>> e.g. 500 entries per txn. much like slapadd -q already does.
>>>>> Ultimately we ought to be able to get syncrepl refresh to occur at
>>>>> nearly the same speed as slapadd -q.
>>>> Batching is ok, except that you never know how many entries you'll
>>>> have, thus you will have to actually write the data after a period of
>>>> time, even if you don't have the 500 entries.
>>> This isn't a problem - we know exactly when refresh completes, so we
>>> can finish the batch regardless of how many entries are left over.
>> True for Refresh. I was thinking more specifically of updates when we
>> are connected.
> None of this is for Persist phase, I have only been talking about refresh.
>>> Testing this out with the experimental ITS#8040 patch - with lazy
>>> commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
>>> to pull them across. With batching 500 entries/txn+lazy commit it
>>> takes ~7 minutes, a decent improvement. It's still 2x slower than
>>> slapadd -q though, which loads the data in 3-1/2 minutes.
>> Not bad at all. What makes it 2x slower, btw?
> Still looking into it. slapadd -q uses 2 threads, one to parse the LDIF
> and one to write to the DB. syncrepl consumer only uses 1 thread.
> Probably if we split reading from the network apart from writing to the
> DB, that would make the difference.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
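The batching scheme discussed above could be sketched roughly like this (this is not the actual syncrepl code; next_entry() is a hypothetical stand-in for the refresh stream):

```c
#include <lmdb.h>

#define BATCH_SIZE 500

/* Hypothetical source of entries; stands in for the syncrepl refresh
 * stream.  Returns 0 when the refresh completes. */
extern int next_entry(MDB_val *key, MDB_val *data);

int load_batched(MDB_env *env, MDB_dbi dbi)
{
    MDB_txn *txn = NULL;
    MDB_val key, data;
    int rc, n = 0;

    while (next_entry(&key, &data)) {
        if (!txn && (rc = mdb_txn_begin(env, NULL, 0, &txn)) != 0)
            return rc;
        if ((rc = mdb_put(txn, dbi, &key, &data, 0)) != 0) {
            mdb_txn_abort(txn);
            return rc;
        }
        if (++n == BATCH_SIZE) {        /* full batch: commit it */
            if ((rc = mdb_txn_commit(txn)) != 0)
                return rc;
            txn = NULL;
            n = 0;
        }
    }
    /* Refresh complete: commit the leftover partial batch, if any. */
    return txn ? mdb_txn_commit(txn) : 0;
}
```

This illustrates the point made above: since the end of refresh is known, the final (possibly short) batch is simply committed when the loop exits, so no timer is needed.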
On Mon, Feb 2, 2015 at 3:37 AM, Howard Chu <hyc(a)symas.com> wrote:
> Hallvard Breien Furuseth wrote:
>> On 02/02/15 00:40, Howard Chu wrote:
>>> It looks OK to me. If no one raises any concerns, I'll commit it in a
>>> few days.
>> Some sudden last thoughts:
>> mdb_dump.c also has a check (memchr(key.mv_data, '\0', key.mv_size))
>> to exclude non-databases, which is no longer valid.
> Good point. As Timur's patch comment notes, we probably need an API call "is
> valid DB" now.
>> Database names with \0 in them can no longer be spelled as strings, so
>> everything which gets DB names from the database must use binary blobs.
>> Including mdb_load and mdb_dump; I notice mdb_load uses
>> strdup() for the "database=" name. Come to think of it, I have no
>> idea if the dump format supports DB names with \0 in them.
> No, it doesn't. It's the BDB format, and BDB only accepted C strings.
(Just noticed that I hit "reply" instead of "reply all". Sorry. Now
reposting to the mailing list.)
I think it is an acceptable limitation of mdb_dump and mdb_load. This
is not the only thing they don't support: they also don't work with
user-defined comparison functions. Although I can think of ways
to solve it.
For example, we could add a command line option that would make
mdb_dump output db names as a string of hexadecimal numbers, and
mdb_load interpret them as such.
I've been talking to Howard about this and he suggested to post it to
this mailing list. There are two things that I recently noticed about
how LMDB handles various encodings, and I think they're worth discussing.
1. Database names
mdb_dbi_open treats its name parameter as a C string. This means UTF-8
on Unix-like systems and the ANSI code page on Windows, which is
problematic for cross-platform applications.
My suggestion is to create a variant of this function that also
accepts a length parameter (or just use MDB_val) so that instead of
treating it as a C string, it would treat it like a series of bytes,
allowing the user to use the encoding of their choice.
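As a sketch of what such a variant might look like (the name mdb_dbi_open2 is purely hypothetical and not part of LMDB):

```c
/* Hypothetical length-delimited variant: the name is an arbitrary byte
 * string, so it may contain '\0' and need not be in any particular
 * encoding.  The existing mdb_dbi_open would remain for C strings. */
int mdb_dbi_open2(MDB_txn *txn, const MDB_val *name,
                  unsigned int flags, MDB_dbi *dbi);
```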
2. Path names
Functions like mdb_env_open, mdb_env_get_path, mdb_env_copy and the
like accept a char* for path names. This is fine on most Unix-like
systems, where paths are UTF-8 strings, but unfortunately these
functions call the ANSI variants of the Windows API functions, making
it impossible to use Unicode path names with them.
I think we should switch to the widechar APIs instead, but that would
also mean changing the LMDB API to accept a wchar_t* parameter on
Windows instead of char*.
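One way the change could look, as a sketch only (mdb_env_openw is a hypothetical name, not an existing LMDB function):

```c
#ifdef _WIN32
#include <wchar.h>
/* Hypothetical wide-character variant for Windows, calling the
 * wide-char (W) file APIs internally; the existing char* entry point
 * would stay for ANSI callers. */
int mdb_env_openw(MDB_env *env, const wchar_t *path,
                  unsigned int flags, mdb_mode_t mode);
#endif
```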
What do you guys think about all this?