The documentation for the `mdb_cursor_del` function doesn't indicate what
happens to the state of the cursor after the targeted object is deleted.
This makes it difficult to reason about, for example, the task of scanning
through the database and deleting keys or values according to some criterion.
What does MDB_NEXT mean on a cursor after deletion? Will it return the next
element, or should I use MDB_GET_CURRENT for the next element? Or will
these operations return an error, and should I use MDB_SET_RANGE with the
last deleted key to reposition the cursor?
Similarly, what happens to every other cursor pointing to the same key? I
see you have some code that touches them, but no documentation to tell
users what is happening.
This needs to be documented. Iterator invalidation is painful in C++ where
it is thoroughly documented. When it's reduced to guesswork or testing
without a clear indication of intended behavior, it's worse.
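Until the behavior is documented, a conservative scan-and-delete pattern is to save the key before deleting and re-seek with MDB_SET_RANGE rather than relying on the post-delete cursor state. A sketch only, assuming a non-MDB_DUPSORT database and keys within the default 511-byte limit; should_delete() is a hypothetical user predicate:

```c
#include <string.h>
#include <lmdb.h>

/* Sketch: delete every entry matching a predicate without depending on
 * the (undocumented) cursor position after mdb_cursor_del.  After each
 * delete, copy the key and re-position with MDB_SET_RANGE, which lands
 * on the first key >= the deleted one, i.e. the next surviving entry. */
int delete_matching(MDB_txn *txn, MDB_dbi dbi,
                    int (*should_delete)(const MDB_val *, const MDB_val *))
{
    MDB_cursor *cur;
    MDB_val key, data;
    int rc = mdb_cursor_open(txn, dbi, &cur);
    if (rc)
        return rc;
    rc = mdb_cursor_get(cur, &key, &data, MDB_FIRST);
    while (rc == 0) {
        if (should_delete(&key, &data)) {
            char saved[511];            /* default MDB_MAXKEYSIZE */
            size_t klen = key.mv_size;
            memcpy(saved, key.mv_data, klen);
            rc = mdb_cursor_del(cur, 0);
            if (rc)
                break;
            key.mv_data = saved;
            key.mv_size = klen;
            rc = mdb_cursor_get(cur, &key, &data, MDB_SET_RANGE);
        } else {
            rc = mdb_cursor_get(cur, &key, &data, MDB_NEXT);
        }
    }
    mdb_cursor_close(cur);
    return rc == MDB_NOTFOUND ? 0 : rc;
}
```

The re-seek costs an extra tree descent per deletion, which is exactly why documented post-delete semantics would be valuable: if MDB_NEXT after mdb_cursor_del is guaranteed to yield the following entry, the copy and re-seek are unnecessary.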
Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> On 03/02/15 09:41, Howard Chu wrote:
>>> Emmanuel Lécharny wrote:
>>>> On 03/02/15 05:11, Howard Chu wrote:
>>>>> Another option here is simply to perform batching. Now that we have
>>>>> the TXN api exposed in the backend interface, we could just batch up
>>>>> e.g. 500 entries per txn. much like slapadd -q already does.
>>>>> Ultimately we ought to be able to get syncrepl refresh to occur at
>>>>> nearly the same speed as slapadd -q.
>>>> Batching is ok, except that you never know how many entries you'll
>>>> have, thus you will have to actually write the data after a period of
>>>> time, even if you don't have the 500 entries.
>>> This isn't a problem - we know exactly when refresh completes, so we
>>> can finish the batch regardless of how many entries are left over.
>> True for Refresh. I was thinking more specifically of updates when we
>> are connected.
> None of this is for Persist phase, I have only been talking about refresh.
>>> Testing this out with the experimental ITS#8040 patch - with lazy
>>> commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
>>> to pull them across. With batching 500 entries/txn+lazy commit it
>>> takes ~7 minutes, a decent improvement. It's still 2x slower than
>>> slapadd -q though, which loads the data in 3-1/2 minutes.
>> Not bad at all. What makes it 2x slower, btw?
> Still looking into it. slapadd -q uses 2 threads, one to parse the LDIF
> and one to write to the DB. syncrepl consumer only uses 1 thread.
> Probably if we split reading from the network apart from writing to the
> DB, that would make the difference.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
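The batching scheme discussed above could be sketched roughly like this (this is not the actual syncrepl code; next_entry() is a hypothetical stand-in for the refresh stream):

```c
#include <lmdb.h>

#define BATCH_SIZE 500

/* Hypothetical source of entries; stands in for the syncrepl refresh
 * stream.  Returns 0 when the refresh completes. */
extern int next_entry(MDB_val *key, MDB_val *data);

int load_batched(MDB_env *env, MDB_dbi dbi)
{
    MDB_txn *txn = NULL;
    MDB_val key, data;
    int rc, n = 0;

    while (next_entry(&key, &data)) {
        if (!txn && (rc = mdb_txn_begin(env, NULL, 0, &txn)) != 0)
            return rc;
        if ((rc = mdb_put(txn, dbi, &key, &data, 0)) != 0) {
            mdb_txn_abort(txn);
            return rc;
        }
        if (++n == BATCH_SIZE) {        /* full batch: commit it */
            if ((rc = mdb_txn_commit(txn)) != 0)
                return rc;
            txn = NULL;
            n = 0;
        }
    }
    /* Refresh complete: commit the leftover partial batch, if any. */
    return txn ? mdb_txn_commit(txn) : 0;
}
```

This illustrates the point made above: since the end of refresh is known, the final (possibly short) batch is simply committed when the loop exits, so no timer is needed.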
On Mon, Feb 2, 2015 at 3:37 AM, Howard Chu <hyc(a)symas.com> wrote:
> Hallvard Breien Furuseth wrote:
>> On 02/02/15 00:40, Howard Chu wrote:
>>> It looks OK to me. If no one raises any concerns, I'll commit it in a
>>> few days.
>> Some sudden last thoughts:
>> mdb_dump.c also has a check (memchr(key.mv_data, '\0', key.mv_size))
>> to exclude non-databases, which is no longer valid.
> Good point. As Timur's patch comment notes, we probably need an API call "is
> valid DB" now.
>> Database names with \0 in them can no longer be spelled as strings, so
>> everything which gets DB names from the database must use binary blobs.
>> Including mdb_load and mdb_dump; I notice mdb_load uses
>> strdup() for the "database=" name. Come to think of it, I have no
>> idea if the dump format supports DB names with \0 in them.
> No, it doesn't. It's the BDB format, and BDB only accepted C strings.
(Just noticed that I hit "reply" instead of "reply all". Sorry. Now
reposting to the mailing list.)
I think it is an acceptable limitation of mdb_dump and mdb_load. This
is not the only thing they don't support: they also don't work with
user-defined comparison functions. Although I can think of ways
to solve it.
For example, we could add a command line option that would make
mdb_dump output db names as a string of hexadecimal numbers, and
mdb_load interpret them as such.
I've been talking to Howard about this and he suggested to post it to
this mailing list. There are two things that I recently noticed about
how LMDB handles various encodings, and I think they're worth discussing.
1. Database names
mdb_dbi_open treats its name parameter as a C string. This means UTF-8
on Unix-like systems and the ANSI code page on Windows, which is
problematic for cross-platform applications.
My suggestion is to create a variant of this function that also
accepts a length parameter (or just use MDB_val) so that instead of
treating it as a C string, it would treat it like a series of bytes,
allowing the user to use the encoding of their choice.
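As a sketch of what such a variant might look like (the name mdb_dbi_open2 is purely hypothetical and not part of LMDB):

```c
/* Hypothetical length-delimited variant: the name is an arbitrary byte
 * string, so it may contain '\0' and need not be in any particular
 * encoding.  The existing mdb_dbi_open would remain for C strings. */
int mdb_dbi_open2(MDB_txn *txn, const MDB_val *name,
                  unsigned int flags, MDB_dbi *dbi);
```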
2. Path names
Functions like mdb_env_open, mdb_env_get_path, mdb_env_copy and the
like accept a char* for path names. This is fine on most Unix-like
systems, where paths are UTF-8 strings, but unfortunately these
functions call the ANSI variants of the Windows API functions, making
it impossible to use Unicode path names with them.
I think we should switch to the widechar APIs instead, but that would
also mean changing the LMDB API to accept a wchar_t* parameter on
Windows instead of char*.
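One way the change could look, as a sketch only (mdb_env_openw is a hypothetical name, not an existing LMDB function):

```c
#ifdef _WIN32
#include <wchar.h>
/* Hypothetical wide-character variant for Windows, calling the
 * wide-char (W) file APIs internally; the existing char* entry point
 * would stay for ANSI callers. */
int mdb_env_openw(MDB_env *env, const wchar_t *path,
                  unsigned int flags, mdb_mode_t mode);
#endif
```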
What do you guys think about all this?