mdb_cursor_del
by David Barbour
The documentation for the `mdb_cursor_del` function doesn't indicate what
happens to the state of the cursor after the targeted object is deleted.
This makes it difficult to reason about, for example, the task of scanning
through the database and deleting keys or values according to some
criterion.
What does MDB_NEXT mean on a cursor after deletion? Will it return the next
element, or should I use MDB_GET_CURRENT for the next element? Or will
these operations return in error, and I should use MDB_SET_RANGE with the
deleted key?
Similarly, what happens to every other cursor pointing to the same key? I
see you have some code that touches them, but no documentation to tell the
user's what is happening.
This needs to be documented. Iterator invalidation is painful in C++ where
it is thoroughly documented. When it's reduced to guesswork or testing
without a clear indication of intended behavior, it's worse.
5 years, 11 months
RE24 testing call #2 (2.4.41), LMDB RE0.9 testing call #2 (0.9.15)
by Quanah Gibson-Mount
OpenLDAP 2.4.41 Engineering
Fixed libldap double free of request during abandon (ITS#7967)
Fixed libldap segfault in ldap_sync_initialize (ITS#8001)
Fixed libldap ldif-wrap off by one error (ITS#8003)
Fixed libldap handling of TLS in async mode (ITS#8022)
Fixed libldap null pointer dereference (ITS#8028)
Fixed slapd slapadd onetime leak with -w (ITS#8014)
Fixed slapd syncrepl delta-mmr issue with overlays and slapd.conf
(ITS#7976)
Fixed slapd syncrepl mutex for cookie state (ITS#7968)
Fixed slapd-mdb one-level search (ITS#7975)
Fixed slapd-mdb heap corruption (ITS#7965)
Fixed slapd-mdb crash after deleting in-use schema (ITS#7995)
Fixed slapd-mdb minor code cleanup (ITS#8011)
Fixed slapd-mdb to return errors when using incorrect env flags
(ITS#8016)
Fixed slapd-meta TLS initialization with ldaps URIs (ITS#8022)
Fixed slapo-collect segfault (ITS#7797)
Fixed slapo-constraint with 0 count constraint (ITS#7780,ITS#7781)
Fixed slapo-deref with empty attribute list (ITS#8027)
Fixed slapo-syncprov synprov_matchops usage of test_filter
(ITS#8013)
Fixed slapo-syncprov segfault on disconnect/abandon
(ITS#5452,ITS#8012)
Build Environment
Enhanced contrib modules build paths (ITS#7782)
Fixed contrib/autogroup internal operation identity
(ITS#8006)
Fixed contrib/passwd/sha2 compiler warning (ITS#8000)
Fixed contrib/noopsrch compiler warning (ITS#7998)
Fixed contrib/dupent compiler warnings (ITS#7997)
Contrib
Added pbkdf2 sha256 and sha512 schemes (ITS#7977)
Documentation
Added ldap_get_option(3) LDAP_FEATURE_INFO_VERSION
information (ITS#8032)
Added ldap_get_option(3) LDAP_OPT_API_INFO_VERSION
information (ITS#8032)
LMDB 0.9.15 Release Engineering
Fix txn init (ITS#7961,#7987)
Fix MDB_PREV_DUP (ITS#7955,#7671)
Fix compact of empty env (ITS#7956)
Added workaround for fdatasync bug in ext3fs
Build
Don't use -fPIC for static lib
Update .gitignore (ITS#7952,#7953)
Cleanup for "make test" (ITS#7841)
Misc. Android/Windows cleanup
Documentation
Fix MDB_APPEND doc
Clarify mdb_dbi_open doc
--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
5 years, 11 months
Re: syncrepl consumer is slow
by Howard Chu
Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> Le 03/02/15 09:41, Howard Chu a écrit :
>>> Emmanuel Lécharny wrote:
>>>> Le 03/02/15 05:11, Howard Chu a écrit :
>>>>> Another option here is simply to perform batching. Now that we have
>>>>> the TXN api exposed in the backend interface, we could just batch up
>>>>> e.g. 500 entries per txn. much like slapadd -q already does.
>>>>> Ultimately we ought to be able to get syncrepl refresh to occur at
>>>>> nearly the same speed as slapadd -q.
>>>>
>>>> Batching is ok, except that you never know how many entries you'll
>>>> going
>>>> to have, thus you will have to actually write the data after a
>>>> period of
>>>> time, even if you don't have the 500 entries.
>>>
>>> This isn't a problem - we know exactly when refresh completes, so we
>>> can finish the batch regardless of how many entries are left over.
>>
>> True for Refresh. I was thinking more specifically of updates when we
>> are connected.
>
> None of this is for Persist phase, I have only been talking about refresh.
>
>>> Testing this out with the experimental ITS#8040 patch - with lazy
>>> commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
>>> to pull them across. With batching 500 entries/txn+lazy commit it
>>> takes ~7 minutes, a decent improvement. It's still 2x slower than
>>> slapadd -q though, which loads the data in 3-1/2 minutes.
>>
>> Not bad at all. What makes it 2x slower, btw?
>
> Still looking into it. slapadd -q uses 2 threads, one to parse the LDIF
> and one to write to the DB. syncrepl consumer only uses 1 thread.
> Probably if we split reading from the network apart from writing to the
> DB, that would make the difference.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
5 years, 11 months
Fwd: LMDB and text encoding
by Timur Kristóf
On Mon, Feb 2, 2015 at 3:37 AM, Howard Chu <hyc(a)symas.com> wrote:
> Hallvard Breien Furuseth wrote:
>>
>> On 02/02/15 00:40, Howard Chu wrote:
>>>
>>> It looks OK to me. No one raises any concerns I'll commit it in a few
>>> hours.
>>
>>
>> Some sudden last thoughts:
>>
>> mdb_dump.c also has a check (memchr(key.mv_data, '\0', key.mv_size)
>> to exclude non-databases, which is no longer valid.
>
>
> Good point. As Timur's patch comment notes, we probably need an API call "is
> valid DB" now.
>
>> Database names with \0 in them can no longer be spelled as strings,
>> everything which gets DB names from the database must use binary blobs.
>> Including mdb_load and mdb_dump; I notice mdb_load uses
>> strdup() for the "database=" name. Come to think of it, I have no
>> idea if the dump format supports DB names with \0 in them.
>
>
> No, it doesn't. It's the BDB format, and BDB only accepted C strings.
(Just noticed that I hit "reply" instead of "reply all". Sorry. Now
reposting to the mailing list.)
I think it is an acceptable limitation of mdb_dump and mdb_load. This
is not the only thing they don't support: they also don't work with
user-defined comparison functions. Although I could think about ways
to solve it.
For example, we could add a command line option that would make
mdb_dump output db names as a string of hexadecimal numbers, and
mdb_load interpret them as such.
5 years, 11 months
LMDB and text encoding
by Timur Kristóf
Hi Everyone,
I've been talking to Howard about this and he suggested to post it to
this mailing list. There are two things that I recently noticed about
how LMDB works with various encodings and I think it's worth to
discuss.
1. Database names
mdb_dbi_open treats its name parameter as a C string. This means UTF-8 on
unixes and ANSI on Windows, which is problematic for cross-platform
applications.
My suggestion is to create a variant of this function that also
accepts a length parameter (or just use MDB_val) so that instead of
treating it as a C string, it would treat it like a series of bytes,
allowing the user to use the encoding of their choice.
2. Path names
Functions like mdb_env_open, mdb_env_get_path, mdb_env_copy and the
likes accept a char* for path names. This is fine on most unixes where
char* is an UTF-8 string, but unfortunately, these functions call the
ANSI variants of the Windows API functions, making it impossible to
use Unicode path names with them.
I think we should switch to the widechar APIs instead, but that would
also mean changing the LMDB API to accept a wchar_t* parameter on
Windows instead of char*.
What do you guys think about all this?
Best regards,
Timur Kristóf
5 years, 11 months