Hallvard Breien Furuseth:
[Due to a typo in your e-mail address, the ITS system did not mail
this message anywhere. So I'm CC'ing Howard directly, just in case.]
Sorry about that. Seems everyone screws up sometimes.
Wietse Venema writes:
> I wrote a test driver that reliably causes LMDB to abort during a
> simulated cache cleanup. This "exploit" produces the same result
> on Linux and FreeBSD, 32-bit and 64-bit systems.
You're using an old read-only transaction which cannot coexist with:
- mdb_env_set_mapsize(), which moves the map that a cursor in
the reader is using.
- several write-transactions + MDB_NOLOCK. The flag means the writers do
not know about the reader, so they reuse pages from the snapshot the
reader is using. The reader can survive while the metapages hold
on to its snapshot, i.e. 1 or 2 write commits (I think).
I don't know if this is a thinko in your program or miscommunication
between you and Howard about MDB_NOLOCK and mapsize changes. With the
current liblmdb, a map change should involve: Remember the reader's
current position (key), resize the map, renew the txn and cursor, and
reposition the cursor.
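
As a rough illustration of those four steps (remember the key, resize, renew, reposition), here is a toy model in plain C. It is not LMDB code: a small sorted in-memory table stands in for the database, and every toy_* name is hypothetical. The comments name the real liblmdb calls that each step roughly corresponds to, on the assumption that the abstraction layer would make those calls at the same points.

```c
/* Toy model only: a sorted in-memory table stands in for an LMDB database.
 * None of the toy_* names are LMDB APIs. */
#include <assert.h>
#include <string.h>

#define TOY_CAP 16
#define TOY_KEYLEN 32

struct toy_store {
    char keys[TOY_CAP][TOY_KEYLEN];   /* kept in sorted order */
    int count;
    unsigned generation;              /* bumped whenever the "map" moves */
};

struct toy_cursor {
    const struct toy_store *store;
    int pos;
    unsigned generation;              /* generation the cursor is bound to */
};

/* Insert a key, keeping the table sorted (duplicates are not handled). */
static void toy_put(struct toy_store *s, const char *key)
{
    int i = s->count;
    assert(s->count < TOY_CAP && strlen(key) < TOY_KEYLEN);
    while (i > 0 && strcmp(s->keys[i - 1], key) > 0) {
        strcpy(s->keys[i], s->keys[i - 1]);
        i--;
    }
    strcpy(s->keys[i], key);
    s->count++;
}

/* Simulated resize: cursors bound to the old generation become stale
 * (compare mdb_env_set_mapsize() moving the map). */
static void toy_resize(struct toy_store *s)
{
    s->generation++;
}

/* Bind the cursor to the store's current generation
 * (compare mdb_txn_renew() followed by mdb_cursor_renew()). */
static void toy_cursor_renew(struct toy_cursor *c, const struct toy_store *s)
{
    c->store = s;
    c->pos = 0;
    c->generation = s->generation;
}

/* Position at the first key >= key (compare mdb_cursor_get() with
 * MDB_SET_RANGE). Returns 0 on success, -1 if there is no such key. */
static int toy_cursor_set_range(struct toy_cursor *c, const char *key)
{
    assert(c->generation == c->store->generation);  /* no stale cursors */
    for (c->pos = 0; c->pos < c->store->count; c->pos++)
        if (strcmp(c->store->keys[c->pos], key) >= 0)
            return 0;
    return -1;
}

/* The four steps: remember the key, resize, renew, reposition. */
static int toy_resize_under_cursor(struct toy_store *s, struct toy_cursor *c)
{
    char saved[TOY_KEYLEN];

    assert(c->pos < s->count);
    strcpy(saved, s->keys[c->pos]);         /* 1. copy the key out of the map */
    toy_resize(s);                          /* 2. resize; cursor is now stale */
    toy_cursor_renew(c, s);                 /* 3. renew */
    return toy_cursor_set_range(c, saved);  /* 4. reposition */
}
```

The point of step 1 is that the key is copied into memory the caller owns before the map moves; a pointer into the old map would not survive step 2.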
Why do you talk about map size changes when I delete a database entry??
I need a software abstraction layer between Postfix and LMDB that
provides the following generic interface that is independent of
1) Give me the first or the next element in the database.
2) Delete, look up, update, a specific database entry.
3) Operations under 1) and 2) must be possible to interleave.
4) Locking must be done outside LMDB, because world-writable lock
files are not an option.
(There is more, but the above is required to use LMDB as a cache
with periodic cleanup. I am not making up stuff here - the above
interface has been used in Postfix for many years and it works with
all supported databases that have an iterator.)
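
To make requirements 1) to 3) concrete, here is a minimal sketch in plain C of an iterator that tolerates interleaved deletes. It is neither Postfix nor LMDB code: all cache_* names are hypothetical, and a sorted in-memory table stands in for the database. The assumption is that the iterator remembers a private copy of the last key it returned instead of a position inside the store, so a delete between two "next" calls cannot invalidate it. Requirement 4) (external locking) is outside the scope of the sketch.

```c
/* Toy sketch only: cache_* names are hypothetical, not Postfix or LMDB APIs. */
#include <assert.h>
#include <string.h>

#define CACHE_CAP 16
#define CACHE_KEYLEN 32

struct cache {
    char keys[CACHE_CAP][CACHE_KEYLEN];  /* kept in sorted order */
    int count;
};

struct cache_iter {
    char last[CACHE_KEYLEN];  /* private copy of the last key returned */
    int started;
};

/* Insert a key, keeping the table sorted (duplicates are not handled). */
static void cache_put(struct cache *c, const char *key)
{
    int i = c->count;
    assert(c->count < CACHE_CAP && strlen(key) < CACHE_KEYLEN);
    while (i > 0 && strcmp(c->keys[i - 1], key) > 0) {
        strcpy(c->keys[i], c->keys[i - 1]);
        i--;
    }
    strcpy(c->keys[i], key);
    c->count++;
}

/* Delete a key. Returns 0 if it was removed, -1 if it was not found. */
static int cache_delete(struct cache *c, const char *key)
{
    int i, j;
    for (i = 0; i < c->count; i++) {
        if (strcmp(c->keys[i], key) == 0) {
            for (j = i; j < c->count - 1; j++)
                strcpy(c->keys[j], c->keys[j + 1]);
            c->count--;
            return 0;
        }
    }
    return -1;
}

/* Return the first key (on the first call) or the next key after the last
 * one returned; NULL at the end. The result points at the iterator's own
 * copy, so it stays valid across deletes. */
static const char *cache_next(const struct cache *c, struct cache_iter *it)
{
    int i;
    for (i = 0; i < c->count; i++) {
        if (!it->started || strcmp(c->keys[i], it->last) > 0) {
            strcpy(it->last, c->keys[i]);
            it->started = 1;
            return it->last;
        }
    }
    return NULL;
}

/* Periodic cleanup: walk all keys and delete those starting with "old",
 * interleaving iteration and deletion. Returns the number removed. */
static int cache_cleanup(struct cache *c)
{
    struct cache_iter it = { "", 0 };
    const char *key;
    int removed = 0;

    while ((key = cache_next(c, &it)) != NULL)
        if (strncmp(key, "old", 3) == 0 && cache_delete(c, key) == 0)
            removed++;
    return removed;
}
```

The linear search in cache_next is only for brevity; with a real ordered store the same lookup would be a "first key greater than the saved key" seek.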
So far the abstraction layer already hides the LMDB-specific MAP_FULL
and MAP_RESIZED error conditions. If this abstraction layer needs
additional code in order to maintain MDB cursor sanity, then please
I had expected that LMDB takes care of its cursors itself, since
the side effects of other API calls are known inside LMDB only.
The test works if I (a) turn MDB_NOLOCK into MDB_NOTLS (I know,
not what you want), and (b) detect map changes in mdb_cursor_get() and
update the cursor to match.
Old 'MDB_val's that the reader fetched are invalid after the mapsize change.
Also, remember that long-lived read-only transactions that the write
transactions do know about prevent the writers from reusing pages the
reader's snapshot is using, resulting in further map growth.