Hallvard Breien Furuseth:
[Due to a typo in your e-mail address, the ITS system did not mail out this message anywhere. So I'm CC'ing Howard directly, just in case.]
Sorry about that. Seems everyone screws up sometimes.
Wietse Venema writes:
I wrote a test driver that reliably causes LMDB to abort during a simulated cache cleanup. This "exploit" produces the same result on Linux and FreeBSD, 32-bit and 64-bit systems.
You're using an old read-only transaction which cannot coexist with:
- mdb_env_set_mapsize(), which moves the map that a cursor in the reader is using.
- several write-transactions + MDB_NOLOCK. The flag means the writers do not know about the reader, so they reuse pages from the snapshot the reader is using. The reader can survive while the metapages hold on to its snapshot, i.e. 1 or 2 write commits (I think).
I don't know if this is a thinko in your program or miscommunication between you and Howard about MDB_NOLOCK and mapsize changes. With the current liblmdb, a map change should involve these steps: remember the reader's current position (key), resize the map, renew the txn and cursor, and reposition the cursor.
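The four steps can be sketched with a toy model. This is not liblmdb code: toy_map, toy_cursor and every function below are hypothetical stand-ins, and the "map" is just a sorted heap array that is deliberately relocated on resize. The sketch only shows why the key has to be copied out before the move and why the cursor is repositioned by key afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_KEY_MAX 32

/* Hypothetical stand-in for the memory map: a sorted array of keys. */
typedef struct {
    char (*keys)[TOY_KEY_MAX];
    size_t count;
} toy_map;

/* Hypothetical cursor: a raw pointer into the map, NULL when unpositioned. */
typedef struct {
    const char *cur;
} toy_cursor;

/* Build a map from n keys that are already in sorted order. */
int toy_map_init(toy_map *m, const char *const *sorted, size_t n)
{
    m->keys = calloc(n ? n : 1, sizeof *m->keys);
    if (m->keys == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        strncpy(m->keys[i], sorted[i], TOY_KEY_MAX - 1);
    m->count = n;
    return 0;
}

/* Simulate a resize that relocates the map: pointers into the old storage dangle. */
int toy_map_move(toy_map *m)
{
    char (*fresh)[TOY_KEY_MAX] = calloc(m->count ? m->count : 1, sizeof *fresh);
    if (fresh == NULL)
        return -1;
    memcpy(fresh, m->keys, m->count * sizeof *fresh);
    free(m->keys);
    m->keys = fresh;
    return 0;
}

/* Position the cursor on the first key >= wanted; return -1 if there is none. */
int toy_cursor_seek(const toy_map *m, toy_cursor *c, const char *wanted)
{
    for (size_t i = 0; i < m->count; i++) {
        if (strcmp(m->keys[i], wanted) >= 0) {
            c->cur = m->keys[i];
            return 0;
        }
    }
    c->cur = NULL;
    return -1;
}

/* Remember the key, resize, drop the stale cursor state, reposition. */
int toy_resize_keep_position(toy_map *m, toy_cursor *c)
{
    char saved[TOY_KEY_MAX];

    if (c->cur == NULL)
        return -1;
    assert(strlen(c->cur) < TOY_KEY_MAX);
    strcpy(saved, c->cur);       /* 1. copy the key out of the map */
    c->cur = NULL;               /* the old pointer must not survive the move */
    if (toy_map_move(m) != 0)    /* 2. resize; the storage is relocated */
        return -1;
    return toy_cursor_seek(m, c, saved);  /* 3+4. fresh position, found by key */
}
```

Seeking to the first key greater than or equal to the saved one also gives a defined position if that entry was deleted in the meantime. With the real library the reposition step would presumably be an mdb_cursor_get() range lookup, but that mapping is an assumption, not something tested here.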
Why do you talk about map size changes when I delete a database entry??
I need a software abstraction layer between Postfix and LMDB that provides the following generic interface that is independent of LMDB internals:
1) Give me the first or the next element in the database.
2) Delete, look up, update, a specific database entry.
3) Operations under 1) and 2) must be possible to interleave.
4) Locking must be done outside LMDB, because world-writable lock files are not an option.
(There is more, but the above is required to use LMDB as a cache with periodic cleanup. I am not making up stuff here - the above interface has been used in Postfix for many years and it works with all supported databases that have an iterator.)
So far the abstraction layer already hides the LMDB-specific MDB_MAP_FULL and MDB_MAP_RESIZED error conditions. If this abstraction layer needs additional code in order to maintain MDB cursor sanity, then please educate me.
I had expected that LMDB takes care of its cursors itself, since the side effects of other API calls are known inside LMDB only.
Wietse
The test works if I (a) turn MDB_NOLOCK into MDB_NOTLS (I know that's not what you want), and (b) detect map changes in mdb_cursor_get() and update the cursor to match. Any old MDB_val that the reader fetched is invalid after the mapsize change. Also, remember that long-lived read-only transactions that write transactions do know about prevent the writers from reusing pages in the reader's snapshot, which results in further map growth.
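Since fetched values point into the map, a caller that needs one across a mapsize change has to copy it first. A minimal sketch, assuming a hypothetical toy_val type that merely resembles a size-plus-pointer value; it is not the library's MDB_val, and toy_val_copy is not a liblmdb function:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical size+pointer value whose data lives in storage we do not own. */
typedef struct {
    size_t size;
    void *data;
} toy_val;

/* Return a heap copy of v's bytes that the caller owns, or NULL on failure. */
void *toy_val_copy(const toy_val *v)
{
    void *copy = malloc(v->size ? v->size : 1);

    if (copy != NULL && v->size > 0)
        memcpy(copy, v->data, v->size);
    return copy;
}
```

The copy stays valid whatever happens to the original storage; the caller frees it when done.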
-- Hallvard