Full_Name: David Wilson
Version: LMDB 0.9.11
Submission from: (NULL) (184.108.40.206)
Currently if a user (wilfully or accidentally, say, through composition of
third party libs) opens the same LMDB environment twice within a process, and
maintains active read transactions at the time one mdb_env_close() is called,
all reader slots will be deallocated in all environments due to the logic
around line 4253 that unconditionally clears reader slots based on PID.
I'd like to avoid this in py-lmdb, and it seems there are a few alternatives:
1) Tell users not to do that (doc fix, nobody reads docs)
This is already in the doc - don't open the same env twice within one process.
2) Externally maintain a set of (st_dev, st_ino) keys representing
opened environments, and check any attempt to open an environment against this
list, failing if data.mdb's key already appears.
That would require global library state, which is definitely unwanted.
3) Disable the mdb_env_close() cleanup loop when MDB_NOTLS is enabled. Since no
reader slot will ever exist in TLS in this mode, the existing mdb_env_close()
doc "All transactions, databases, and cursors must already be closed before
calling this function." ensures no readers exist at close time, and the loop
need not run.
Could do this, but since it's an incomplete solution, not sure there's any point.
4) Modify lock.mdb to include the MDB_env* address within the process, and
mdb_env_close() to invalidate only readers associated with the environment
being closed. I dislike using the MDB_env* as an opaque publicly visible
cookie, but perhaps storing this field might be reused for other things in
Hmm. I guess this could work. I think storing the MDB_env* is harmless since
the pointer is meaningless to any other process.
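For reference, the (st_dev, st_ino) check described in option 2 could be sketched roughly as below. This is only an illustration of the idea, not py-lmdb or LMDB code: the names `_open_envs`, `env_key`, `register_env` and `unregister_env` are hypothetical, and the sketch assumes the environment path is a directory containing data.mdb. The module-level set is exactly the kind of global state objected to above.

```python
import os

# Hypothetical process-wide registry of open environments, keyed by the
# (st_dev, st_ino) pair of each environment's data.mdb file.
_open_envs = set()


def env_key(path):
    """Return the (st_dev, st_ino) identity of data.mdb under `path`."""
    st = os.stat(os.path.join(path, "data.mdb"))
    return (st.st_dev, st.st_ino)


def register_env(path):
    """Record `path` as open; fail if the same file is already registered."""
    key = env_key(path)
    if key in _open_envs:
        raise RuntimeError("environment already open in this process: %r" % (path,))
    _open_envs.add(key)
    return key


def unregister_env(key):
    """Forget an environment when it is closed."""
    _open_envs.discard(key)
```

Because the key comes from stat() rather than the path string, the same file reached through a symlink or a different relative path would still be detected.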
Option 3 lets the user wilfully mix py-lmdb Environment objects within one
process, since py-lmdb always enables NOTLS, but it does not fix the case where
the user is integrating a Python application with a C library with some opaque
interface, and both the application and the library independently need to
access the environment.
Allowing multiple independent libraries to open the same environment would
consume redundant chunks of address space. For large enough DBs this may
Unfortunately the semantics of fcntl locks prevent us from even detecting
that this problem has occurred.
This kind of scenario regularly crops up in huge software projects, where the user
may not even be aware the library uses LMDB internally. In that case, the user
is again exposed to having their reader slots silently become deallocated
through no fault of their own.
Option 4 is the least aesthetic, but has all the benefits of option 3 in
addition to preventing the "accidental integration" scenario. I've attached a
patch for it (against 0.9.11), although I completely understand if it does not
get applied. :)
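To make the difference concrete, here is a toy model of the reader table contrasting the PID-only cleanup with the per-environment cleanup proposed in option 4. It is a sketch under assumptions, not the attached patch and not the real lock.mdb layout: `ReaderSlot`, `env_token`, `close_by_pid` and `close_by_token` are hypothetical names, and `env_token` stands in for whatever per-environment cookie (such as the MDB_env* address) would be stored in each slot.

```python
from dataclasses import dataclass


@dataclass
class ReaderSlot:
    pid: int = 0        # 0 marks a free slot
    env_token: int = 0  # hypothetical per-environment cookie (option 4)


def close_by_pid(table, pid):
    """Model of the behaviour described in the report: free every slot
    owned by this PID, regardless of which environment handle took it."""
    for slot in table:
        if slot.pid == pid:
            slot.pid = 0
            slot.env_token = 0


def close_by_token(table, pid, env_token):
    """Model of option 4: free only the slots that belong to the
    environment handle being closed."""
    for slot in table:
        if slot.pid == pid and slot.env_token == env_token:
            slot.pid = 0
            slot.env_token = 0


def live_readers(table):
    """Count the slots that are still in use."""
    return sum(1 for slot in table if slot.pid != 0)
```

With two slots held by the same PID under different tokens, `close_by_pid` frees both, while `close_by_token` leaves the other handle's reader in place.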
One alternative would be, instead of using the MDB_env* address, to use a
monotonically increasing counter based on a static global, say, "mrb_token",
but this might break in the case two copies of LMDB were somehow linked into the
application. (It's easy to argue the user is an idiot in this case ;)
Global state is verboten.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/