Howard Chu hyc@symas.com wrote on 14.08.2020 at 02:14 in message
a9c55a18-7a70-1c84-42d7-bf716345591d@symas.com:
Gábor Melis wrote:
Hello
I'm writing a Common Lisp wrapper for LMDB, starting where the previous efforts left off. I have a number of questions related to safety and the color of the smoke after a disaster.
You should consider any misuses as you describe here as fatal, resulting in irreparably corrupted DBs.
lmdb.h says that "A parent transaction and its cursors may not issue any other operations than mdb_txn_commit and mdb_txn_abort while it has active child transactions."
What I observe is that when a cursor associated with the parent transaction is used in the child, there are no errors and the cursor behaves (my test only involved mdb_cursor_put and MDB_SET_KEY) as if it belonged to the child.
Is this behavior to be expected in general, or are my tests insufficient and can something really bad happen? If this is a disaster waiting to happen, I need to add checks to the cursor code.
Sounds like your test case was lucky.
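A wrapper-side check of the kind asked about could look roughly like the sketch below. It is only an illustration of the bookkeeping, under the assumption that the wrapper keeps its own records for transactions and cursors; the names wrap_txn, wrap_cursor and the WRAP_* codes are hypothetical and not part of LMDB. The idea is to count active child transactions per parent and to refuse any cursor or transaction operation on a parent while that count is non-zero, as the lmdb.h rule quoted above requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper-side bookkeeping; none of this is LMDB API. */
typedef struct wrap_txn {
    struct wrap_txn *parent;  /* NULL for a top-level transaction */
    int active_children;      /* nested txns not yet committed or aborted */
    void *mdb_txn;            /* a real wrapper would store the MDB_txn * here */
} wrap_txn;

typedef struct wrap_cursor {
    wrap_txn *txn;            /* transaction the cursor was opened in */
    void *mdb_cursor;         /* a real wrapper would store the MDB_cursor * here */
} wrap_cursor;

enum { WRAP_OK = 0, WRAP_ERR_PARENT_BUSY = -1 };

/* Record a new transaction; call right after a successful mdb_txn_begin. */
void wrap_txn_init(wrap_txn *t, wrap_txn *parent) {
    t->parent = parent;
    t->active_children = 0;
    t->mdb_txn = NULL;
    if (parent != NULL)
        parent->active_children++;
}

/* Record the end of a transaction; call after its commit or abort. */
void wrap_txn_finish(wrap_txn *t) {
    if (t->parent != NULL) {
        assert(t->parent->active_children > 0);
        t->parent->active_children--;
    }
}

/* A transaction with active children may only be committed or aborted. */
int wrap_txn_check_usable(const wrap_txn *t) {
    return t->active_children == 0 ? WRAP_OK : WRAP_ERR_PARENT_BUSY;
}

/* Cursors inherit the restriction of the transaction they belong to. */
int wrap_cursor_check_usable(const wrap_cursor *c) {
    return wrap_txn_check_usable(c->txn);
}
```

The wrapper would call wrap_cursor_check_usable before every cursor operation and signal an error instead of calling into LMDB when it returns WRAP_ERR_PARENT_BUSY. The check is a single integer comparison, so its cost on hot paths should be small.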
- mdb_txns are calloc()ed and free()d. In the case where a thread performs some operation (e.g. put, get, del) involving an already freed mdb_txn pointer, what kind of nastiness can happen? Can the database be corrupted?
The C standard says any references to freed memory result in undefined behavior. Nobody can give you a more specific answer than that.
Same question about mdb_cursors.
Async unwind safety. This is a bit like a thread being destroyed in the middle of an lmdb function call.
Context: In some Common Lisp implementations (e.g. SBCL), POSIX signals such as SIGINT are used during development. If the developer presses C-c, the Lisp debugger starts at the point where the signal handler was invoked, which may be in the middle of some mdb_* call. Depending on the actions taken, the stack (both the Lisp and the C stack) may be unwound to some earlier frame. Asynchronous timeouts (SBCL's WITH-TIMEOUT) are another example that can unwind the stack. I understand that async unwinds are unsafe in general.
There is a way to defer handling of interrupts, which I already use to protect allocations (mdb_txn_begin, mdb_txn_commit and similar), but it has a small performance cost and I hesitate to apply it to performance hotspots (e.g. put, get, del and most cursor ops). Are [some of] these functions safe in the face of async unwinds? What kinds of problems may arise?
In a default build, read txns are always safe. No guarantees on an interrupted write txn.
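For readers who want to see the deferral idea at the C level, here is a minimal sketch using plain POSIX signal masking. It is an analogy, not SBCL's mechanism: SBCL has its own facility for deferring interrupts, and the function names below (call_with_sigint_deferred, simulated_write, install_sigint_handler) are made up for illustration. SIGINT is blocked for the duration of a call, so a signal arriving in the middle stays pending and the handler only runs once the call has returned. sigprocmask is used for simplicity; in a multi-threaded program pthread_sigmask would be the specified choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler so callers can see when SIGINT was actually delivered. */
volatile sig_atomic_t sigint_seen = 0;

static void on_sigint(int signo) {
    (void)signo;
    sigint_seen = 1;
}

/* Install the demo handler; returns 0 on success, -1 on failure. */
int install_sigint_handler(void) {
    struct sigaction sa;
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}

/* Run fn(arg) with SIGINT blocked, then restore the previous signal mask.
 * A SIGINT arriving during fn stays pending and is delivered on unblock.
 * Returns fn's result, or -1 if the mask could not be changed. */
int call_with_sigint_deferred(int (*fn)(void *), void *arg) {
    sigset_t block, saved;
    int rc;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;
    rc = fn(arg);
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return rc;
}

/* Stand-in for a write operation that gets interrupted halfway through. */
int simulated_write(void *arg) {
    (void)arg;
    raise(SIGINT);       /* stays pending because SIGINT is blocked here */
    return sigint_seen;  /* still 0: the handler has not run yet */
}
```

Calling call_with_sigint_deferred(simulated_write, NULL) after installing the handler returns 0, and sigint_seen becomes 1 only after the mask is restored. The cost is two system calls per protected call, which is the kind of overhead that makes one hesitate to wrap every put, get and del. Given the answer above, write transactions seem to be where the protection matters.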
A valid question remains, though: could some extra "debugging" protection code be added to LMDB to help locate such problems? Examples that come to mind: assigning NULL to pointers that have been freed, assigning -1 to file descriptors that have been closed, and so on. On modern architectures you would then at least get a core dump when such a stale handle is used.
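To make the suggestion concrete, the two idioms could be sketched as below. This is generic C, not code taken from LMDB, and the names FREE_AND_NULL and close_and_invalidate are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Free a pointer and overwrite the variable, so that a later dereference
 * through this variable faults instead of silently touching freed memory. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Close a descriptor and mark it invalid. Calling it again on the same
 * variable is a harmless no-op. Returns the result of close(), or 0. */
int close_and_invalidate(int *fd) {
    int rc = 0;
    if (*fd >= 0) {
        rc = close(*fd);
        *fd = -1;
    }
    return rc;
}
```

One limitation: this only resets the variable that is passed in. Functions such as mdb_txn_commit take the handle by value (MDB_txn *), so the library cannot reset the copy held by the caller. A language wrapper would have to clear its own copy of the pointer after commit or abort to get the same effect.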
Cheers, Gábor Melis
--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/