LMDB crash consistency, again
by Howard Chu
This paper
https://www.usenix.org/conference/osdi14/technical-sessions/presentation/...
describes a potential crash vulnerability in LMDB due to its use of fdatasync
instead of fsync when syncing writes to the data file. The vulnerability
exists because fdatasync omits syncs of the file metadata; if the data file
needs to grow as a result of any writes, that growth requires a metadata update.
This is a well-understood issue in LMDB; we briefly touched on it in this
earlier email thread
http://www.openldap.org/lists/openldap-technical/201402/msg00111.html and it's
been a topic of discussion on IRC ever since the first multi-FS
microbenchmarks we conducted back in 2012. http://symas.com/mdb/microbench/july/
It's worth noting that this vulnerability doesn't exist on Windows, Mac OS X,
Android, or *BSD, because none of these OSs have a function equivalent to
fdatasync in the first place - they always use fsync (or the Windows
equivalent). (Android is an oddball; the underlying Linux kernel of course
supports fdatasync, but the C library, bionic, does not.)
We have a couple of approaches for Linux:
1) provide an option to preallocate the file, using fallocate().
Unfortunately this doesn't completely eliminate metadata updates - filesystem
drivers tend to try to be "smart" and make fallocate cheap; they allocate the
space in the FS metadata but they also mark it as "unseen." The first time a
process accesses an unseen page, it gets zeroed out. Up until that point,
whatever old contents were on the disk page are still present. The act of flipping a
page from "unseen" to "seen" requires a metadata update of its own.
We had a discussion of this FS mis-feature a while ago, but it was fruitless.
https://lkml.org/lkml/2012/12/7/396
2) preallocate the file by explicitly writing zeros to it. This has a
couple of other disadvantages:
	a) on SSDs, doing such a write needlessly contributes to wear-out of the
flash.
b) Windows detects all-zero writes and compresses them out, creating a
sparse file, thus defeating the attempt at preallocation.
3) track the allocated size of the file, and toggle between fsync and
fdatasync depending on whether the allocated size actually grows or not. This
is the approach I'm currently taking in a development branch. Whether we add
this to a new 0.9.x release, or just in 1.0, I haven't yet decided.
As another footnote, I plan to add support for LMDB on a raw partition in 1.x.
Naturally, fsync vs fdatasync will be irrelevant in that case.
--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Re: Antw: Passwords, Hashing, and Binds
by Quanah Gibson-Mount
--On Monday, November 24, 2014 12:22 PM +0100 Onno van der Straaten
<onno.van.der.straaten(a)gmail.com> wrote:
> sudo make install
I'd generally advise that you really read over the options to configure and
build a better set of binaries. For example, leave out back-bdb/hdb, and
enable building things modularly.
My options are:
--with-cyrus-sasl \
--with-tls=openssl \
--enable-dynamic \
--enable-slapd \
--enable-modules \
--enable-backends=mod \
--disable-shell \
--disable-sql \
--disable-bdb \
--disable-hdb \
--disable-ndb \
--enable-overlays=mod \
--enable-debug \
--enable-spasswd \
--enable-crypt; \
> Make the sha2 module
> cd ~/openldap/contrib/slapd-modules/passwd/sha2
> sed -i.bak 's/-Wall -g/-Wall -g -fPIC/g' Makefile
> make
I do:
(cd openldap-$(LDAP_VERSION)/contrib/slapd-modules/passwd/sha2; \
	$(MAKE) prefix=/usr/local LIBS="-L$(LDAP_LIB_DIR) -lldap_r -llber" \
		install STRIP=""; \
)
And then it installs it for me in the same location. Just make sure you
use the same prefix here.
> This results in a number of files: pw-sha2.la sha2.lo sha2.o
> slapd-sha2.lo slapd-sha2.o
>
> The question now is how to install this on my target OpenLDAP server. I
> put the files in /usr/lib64/openldap en dan tried to add the following
> dn: cn=module{0},cn=config
> changetype: modify
> replace: olcModuleLoad
> olcModuleLoad: slapd-sha2.la
I'm not sure that replacing olcModuleLoad is correct. If you already have
values in there, you probably want to keep them. I generally *add* an
additional value. In any case, your value for the attribute is incorrect.
The .la file is named, as in your email, pw-sha2.la, not slapd-sha2.la.
If you want to add it as an additional module to load, then you would do
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: pw-sha2.la
My loaded modules are:
dn: cn=module{0}
objectClass: olcModuleList
cn: module{0}
olcModulePath: /opt/zimbra/openldap/sbin/openldap
olcModuleLoad: {0}back_mdb.la
olcModuleLoad: {1}back_monitor.la
olcModuleLoad: {2}syncprov.la
olcModuleLoad: {3}accesslog.la
olcModuleLoad: {4}dynlist.la
olcModuleLoad: {5}unique.la
olcModuleLoad: {6}noopsrch.la
olcModuleLoad: {7}pw-sha2.la
for example.
Now, if you want to make something like, say, SSHA512 the default, then you
need to modify the frontend config db:
dn: olcDatabase={-1},cn=config
changetype: modify
replace: olcPasswordHash
olcPasswordHash: {SSHA512}
--Quanah
--
Quanah Gibson-Mount
Server Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Re: [LMDB] Lockups with robust mutexes and crashing processes
by Marcos-David Dione
Marcos-David Dione/NCE/AMADEUS wrote on 24/11/2014 11:10:42:
> Seen like that I'm not sure if there's a defined behaviour
> for that. I'll ask in the glibc and/or kernel MLs and I'll come
> back with the answer.
and here's the answer:
> On 11/24/2014 03:34 PM, Marcos Dione wrote:
> > We found a situation where a robust mutex cannot be recovered
> > from a stale lock and we're wondering if it's simply an undefined
> > situation or a bug in the kernel. Attached you will find the sample
code, which is loosely based on a glibc test case. The gist of it is as
follows:
> >
> > 1. we open a file.
> > 2. we mmap it and use that mem to store a robust mutex.
> > 3. we lock the mutex.
> > 4. we munmap the file.
> > 5. we close the file.
>
> Undefined behaviour.
>
> This results in undefined behaviour since the allocated storage for
> the mutex object has been lost. You need to keep that storage around
> for the robust algorithms to work with. Without any data you can't
> do anything.
Full answer:
https://sourceware.org/ml/libc-help/2014-11/msg00035.html
--
Marcos Dione
Astek Sud-Est
R&D-SSP-DTA-TAE-TDS
for Amadeus SAS
T: +33 (4)4 9704 1727
marcos-david.dione(a)amadeus.com
[LMDB] Lockups with robust mutexes and crashing processes
by Marcos-David Dione
I already posted this to the IRC channel, but there was no
response, so I'm reposting it here.
I'm trying out lmdb from master, including the robust mutex code.
We're experiencing lock-ups after the process holding the lock dies, as if
the robust lock was not recovered. I tried to come up with an LMDB example
that shows it, and I got one in just a few lines. It uses fork() just to
automate it; note that the environment is opened in both children. Here's
the code:
http://pastebin.com/Cbbri6az
If I run this, I see that one of the children waits for the write
lock and is not awakened when the other child dies without closing the txn
(but notice I close the env). This is on purpose, to simulate a crashing
process. The worst part is that I can't reproduce it using libpthread and
mmap directly. Here is the code I came up with:
http://pastebin.com/ybR5L4cP
It's a little bit more verbose because I based it on a glibc test
case.
Are we missing anything? It seems to us that the code does
not break any of LMDB's caveats (especially the one about creating the envs
before fork()'ing). Is it wrong to assume that the waiting process should
recover the lock from staleness?
--
Marcos Dione
Astek Sud-Est
R&D-SSP-DTA-TAE-TDS
for Amadeus SAS
T: +33 (4)4 9704 1727
marcos-david.dione(a)amadeus.com
Accessing entry attributes from pwdcheckmodule
by Daly Chikhaoui
Hello
I need to access, from the pwdcheckmodule script, the attributes of the
current entry (the one whose password complexity we are checking). Is that possible?
Regards,
Daly