https://bugs.openldap.org/show_bug.cgi?id=10095
Issue ID: 10095
Summary: Race condition causing corruption of mutexes when closing the database
Product: LMDB
Version: 0.9.30
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: peter@peterzhu.ca
Target Milestone: ---
We're running into a race condition across multiple processes that corrupts the mutexes when a process closes the database. It was introduced by the fix for https://bugs.openldap.org/show_bug.cgi?id=9278 (commit https://git.openldap.org/openldap/openldap/-/commit/f683ffdc81d0edb20437cb7d...).
Here's the interleaving of two processes (p0 and p1) that can cause this situation.
p0: Opens connection to database using mdb_env_create and mdb_env_open.
...some things happen in between...
p0: Begins closing the database using mdb_env_close:
  p0: Calls mdb_env_close0:
    p0: Acquires write lock on the file lock using mdb_env_excl_lock.
    p0: Calls pthread_mutex_destroy on the mutexes.
SWITCH TO p1
p1: Begins opening the database using mdb_env_create, then calls mdb_env_open. In mdb_env_open:
  p1: Calls mdb_env_setup_locks:
    p1: Calls mdb_env_excl_lock, but cannot acquire a write file lock because p0 holds it. It waits to acquire a read file lock.
SWITCH TO p0
p0: Calls close on the file descriptor which releases the write lock.
SWITCH TO p1
p1: Acquires the read file lock.
p1: Does NOT call pthread_mutex_init since it did not acquire a write file lock.
...some things happen in between...
p1: Tries to lock the mutex using pthread_mutex_lock. This call fails with EINVAL because the mutex has been destroyed.
I'm not sure how to actually solve this problem. We're currently mitigating this problem by reverting the commit linked above (so no mutexes get destroyed).
--- Comment #1 from Howard Chu hyc@openldap.org --- We had a discussion of this problem before, but I don't recall where. My suggestion was to immediately attempt to change the readlock to a writelock after acquiring the readlock. (Again, with no wait.) The only objection I recall was that this may significantly delay env open operations, but I don't think that's true, if we use F_SETLK and not F_SETLKW.
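The suggestion in this comment can be sketched roughly as follows. This is only an illustration of the lock sequence using plain fcntl() record locks, not LMDB's actual code: set_whole_file_lock and open_lock_with_upgrade are hypothetical names, and the return convention (1 = exclusive, 0 = shared, -1 = error) is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: request a whole-file fcntl lock.
 * cmd is F_SETLK (no wait) or F_SETLKW (wait); type is F_RDLCK or F_WRLCK. */
static int set_whole_file_lock(int fd, int cmd, short type)
{
    struct flock fl = {0};
    int rc;

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                 /* 0 means "through end of file" */
    do {
        rc = fcntl(fd, cmd, &fl);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Sketch of the proposed sequence (hypothetical function, not LMDB API).
 * Returns 1 if the caller ends up with the write lock, 0 if it only
 * holds the read lock, -1 on error. */
int open_lock_with_upgrade(int fd)
{
    assert(fd >= 0);

    /* 1. Try for the write lock without waiting. */
    if (set_whole_file_lock(fd, F_SETLK, F_WRLCK) == 0)
        return 1;
    if (errno != EACCES && errno != EAGAIN)
        return -1;

    /* 2. Somebody else holds a lock: wait for a read lock. */
    if (set_whole_file_lock(fd, F_SETLKW, F_RDLCK) != 0)
        return -1;

    /* 3. Immediately try to upgrade, again without waiting (F_SETLK,
     *    not F_SETLKW), so the open is not delayed by other readers. */
    if (set_whole_file_lock(fd, F_SETLK, F_WRLCK) == 0)
        return 1;
    if (errno == EACCES || errno == EAGAIN)
        return 0;                 /* still hold the read lock */
    return -1;
}
```

A caller that gets 1 back would be the one to (re)initialize the mutexes; a caller that gets 0 would assume they are already valid.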
--- Comment #2 from Peter Zhu peter@peterzhu.ca --- Thank you for the quick reply. I considered that approach: try to acquire the write lock, acquire the read lock, then try again to acquire the write lock. But I think there's still an issue if two or more processes (e.g. p1 and p2) attempt to connect to the database at the same time. The interleaving looks like this:
p0: Opens connection to database using mdb_env_create and mdb_env_open.
...some things happen in between...
p0: Begins closing the database using mdb_env_close:
  p0: Calls mdb_env_close0:
    p0: Acquires write lock on the file lock using mdb_env_excl_lock.
    p0: Calls pthread_mutex_destroy on the mutexes.
SWITCH TO p1 and p2
p1, p2: Begin opening the database using mdb_env_create, then call mdb_env_open. In mdb_env_open:
  p1, p2: Call mdb_env_setup_locks:
    p1, p2: Call mdb_env_excl_lock, but cannot acquire a write file lock because p0 holds it. They wait to acquire a read file lock.
SWITCH TO p0
p0: Calls close on the file descriptor which releases the write file lock.
SWITCH TO p1, p2
p1, p2: Acquire the read file lock.
p1, p2: Fail to acquire the write file lock because the other process also holds a read file lock.
p1, p2: Do NOT call pthread_mutex_init since neither acquired a write file lock.
...some things happen in between...
p1, p2: Try to lock the mutex using pthread_mutex_lock. This call fails with EINVAL because the mutex has been destroyed.
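The step where both readers fail to upgrade can be reproduced outside LMDB with plain fcntl() locks. The sketch below is illustrative only: try_lock and count_successful_upgrades are hypothetical names, the parent plays p1 and a forked child plays p2, and the pipes exist purely to force the interleaving described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking whole-file lock request: 0 on success, -1 on failure. */
static int try_lock(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;       /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical demo: parent (p1) and child (p2) each take a read lock,
 * then each tries to upgrade to a write lock without waiting.
 * Returns how many of the two upgrades succeeded, or -1 on setup error. */
int count_successful_upgrades(const char *path)
{
    int to_child[2], to_parent[2];
    char status = 'E';

    assert(path != NULL);
    if (pipe(to_child) != 0 || pipe(to_parent) != 0)
        return -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0 || try_lock(fd, F_RDLCK) != 0)   /* p1 takes its read lock */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* p2: fcntl locks are per process and are not inherited by fork. */
        char unused;
        close(to_child[1]);
        close(to_parent[0]);
        int cfd = open(path, O_RDWR);
        if (cfd >= 0 && try_lock(cfd, F_RDLCK) == 0)
            status = (try_lock(cfd, F_WRLCK) == 0) ? 'U' : 'R';
        if (write(to_parent[1], &status, 1) != 1)
            _exit(1);
        /* Keep holding the read lock until the parent closes its pipe end. */
        if (read(to_child[0], &unused, 1) < 0)
            _exit(1);
        _exit(0);
    }

    close(to_parent[1]);
    if (read(to_parent[0], &status, 1) != 1)
        status = 'E';
    /* p1 tries its own upgrade while p2 still holds a read lock. */
    int parent_upgraded = (try_lock(fd, F_WRLCK) == 0);

    close(to_child[1]);           /* EOF lets p2 exit and drop its lock */
    waitpid(pid, NULL, 0);
    close(to_child[0]);
    close(to_parent[0]);
    close(fd);
    if (status == 'E')
        return -1;
    return (status == 'U') + parent_upgraded;
}
```

With both processes holding a read lock at the moment they attempt the upgrade, neither request can be granted, so nobody ends up responsible for calling pthread_mutex_init.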
--- Comment #3 from Howard Chu hyc@openldap.org --- We can add a flag to the lockfile for "mutex is valid" but we still wouldn't have a good way to resolve which of p1 or p2 should do the initialization then.
And p0 has no way to know that other processes are waiting to open the env, in which case it could just skip the mutex_destroy.
--- Comment #4 from Howard Chu hyc@openldap.org --- Probably we should revert the ITS#9278 patch. Instead, the fact that the FreeBSD thread library breaks if the mutex is unmapped should be treated as a FreeBSD bug.
--- Comment #5 from Peter Zhu peter@peterzhu.ca ---
> We can add a flag to the lockfile for "mutex is valid"
I think this will guarantee that this bug does not occur, but I think there is a chance of a livelock, since p1 and p2 can get stuck in a cycle: acquire the read lock, see that the "mutex is valid" flag is not set, try to acquire a write lock, fail because both are holding a read lock, release the read lock, and try again. It might be possible to mitigate this with random backoff, but that's probably bad for performance.
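For illustration, the retry cycle with random backoff might look something like the sketch below. Everything here is hypothetical: file_lock and lock_for_open are made-up names, mutex_valid stands in for the proposed lockfile flag, and the 1 to 1000 microsecond delay and the attempt limit are arbitrary choices, not anything LMDB does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: whole-file fcntl lock.
 * type is F_RDLCK, F_WRLCK or F_UNLCK; cmd is F_SETLK or F_SETLKW. */
static int file_lock(int fd, int cmd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;       /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, cmd, &fl);
}

/* Sketch of the retry cycle (hypothetical function and flag).
 * Returns 1 if the caller holds the write lock and should initialize
 * the mutexes (and then set the flag), 0 if it holds a read lock and
 * the flag says the mutexes are valid, -1 on error or when
 * max_attempts rounds all lost the race. */
int lock_for_open(int fd, const volatile int *mutex_valid, int max_attempts)
{
    assert(fd >= 0 && mutex_valid != NULL);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (file_lock(fd, F_SETLKW, F_RDLCK) != 0)
            return -1;
        if (*mutex_valid)
            return 0;             /* someone already initialized them */

        if (file_lock(fd, F_SETLK, F_WRLCK) == 0)
            return 1;             /* we are the initializer */
        if (errno != EACCES && errno != EAGAIN)
            return -1;

        /* Another reader is in the way: release and back off randomly
         * so that two openers do not keep colliding in lockstep. */
        if (file_lock(fd, F_SETLK, F_UNLCK) != 0)
            return -1;
        struct timespec delay = {0, (long)(rand() % 1000 + 1) * 1000L};
        nanosleep(&delay, NULL);
    }
    return -1;
}
```

The backoff only makes repeated collisions less likely; it does not bound how long an open can take, which is the performance concern raised above.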
> Probably we should revert the ITS#9278 patch.
That's what we did in our production systems, and it seems to have resolved the issue. AFAIK Linux does not allocate any memory in `pthread_mutex_init`, so not calling `pthread_mutex_destroy` shouldn't leak memory (although according to the specification we're supposed to call `pthread_mutex_destroy` when we're done using the mutex).
Howard Chu hyc@openldap.org changed:
What             |Removed      |Added
----------------------------------------------------------------------------
See Also         |             |https://bugs.openldap.org/show_bug.cgi?id=9278
Howard Chu hyc@openldap.org changed:
What             |Removed      |Added
----------------------------------------------------------------------------
See Also         |             |https://bugs.openldap.org/show_bug.cgi?id=10058
Howard Chu hyc@openldap.org changed:
What             |Removed      |Added
----------------------------------------------------------------------------
CC               |             |jiri.novosad@gmail.com
--- Comment #6 from Howard Chu hyc@openldap.org --- *** Issue 10058 has been marked as a duplicate of this issue. ***
Howard Chu hyc@openldap.org changed:
What             |Removed      |Added
----------------------------------------------------------------------------
Status           |UNCONFIRMED  |RESOLVED
Resolution       |---          |TEST
--- Comment #7 from Howard Chu hyc@openldap.org --- Fixed in 3dde6c46e6c55458eadaf7f81492c822414be2c7
--- Comment #8 from Howard Chu hyc@openldap.org --- The FreeBSD team acknowledges this was a bug in their threads library, and it has since been fixed. See discussion in ITS#9278.
--- Comment #9 from Peter Zhu peter@peterzhu.ca --- Thank you for fixing this issue!
Quanah Gibson-Mount quanah@openldap.org changed:
What             |Removed      |Added
----------------------------------------------------------------------------
Target Milestone |---          |0.9.32
Keywords         |needs_review |
--- Comment #10 from Quanah Gibson-Mount quanah@openldap.org ---
re0.9:
commit ce200dca1d648f696157e3e49b1800480fef1acb
Author: Howard Chu hyc@openldap.org