https://bugs.openldap.org/show_bug.cgi?id=9564
Issue ID: 9564
Summary: Race condition with freeing the spilled pages from transaction
Product: LMDB
Version: 0.9.29
Hardware: Other
OS: Mac OS
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: kriszyp@gmail.com
Target Milestone: ---
Created attachment 825 --> https://bugs.openldap.org/attachment.cgi?id=825&action=edit Free the spilled pages and dirty overflows before unlocking the write mutex
The spilled page list (a transaction's mt_spill_pgs) is freed *after* a write transaction's mutex is unlocked (in mdb.master3). This can result in a race condition: a second transaction can start and assign a new mt_spill_pgs list to the transaction structure that it reuses. If this happens after the first transaction unlocks the mutex, but before it frees mt_spill_pgs, then the first transaction ends up freeing the same spilled page list that the second transaction will free, resulting in a double free (and a crash).
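To make the ordering concrete, here is a small single-threaded replay of the interleaving described above. It is only a sketch, not LMDB code: `toy_txn`, `toy_release` and `replay` are hypothetical names, only the field name mt_spill_pgs comes from the report, and releases are counted rather than passed to free() so the replay is safe to run. It also assumes, for simplicity, that the first writer never allocated a spill list of its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for the reused write-transaction structure. */
typedef struct {
    int *mt_spill_pgs;
} toy_txn;

/* Records a release instead of calling free(). Like free(), it does
 * not clear the pointer stored in the structure. */
static void toy_release(toy_txn *txn, const int *watched, int *hits)
{
    if (txn->mt_spill_pgs == watched)
        (*hits)++;
}

/* Returns how many times writer B's list ends up being released.
 * free_before_unlock selects where writer A's free happens. */
static int replay(int free_before_unlock)
{
    int list_b = 0;          /* stand-in for B's heap-allocated list */
    int hits = 0;
    toy_txn txn = { NULL };  /* assumption: writer A never spilled */

    if (free_before_unlock)
        toy_release(&txn, &list_b, &hits);  /* A frees under the mutex */

    /* A unlocks; B starts on the same structure and installs its list. */
    txn.mt_spill_pgs = &list_b;

    if (!free_before_unlock)
        toy_release(&txn, &list_b, &hits);  /* A's late free hits B's list */

    toy_release(&txn, &list_b, &hits);      /* B's own free when it ends */
    return hits;
}
```

In this toy model, `replay(0)` reports two releases of B's list (the double free), while `replay(1)` reports one.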
This would seem extremely unlikely to actually happen, since the free call is literally the next operation after the mutex is unlocked, and the second transaction would need to make it all the way to the point of saving the freelist before a page spill list is likely to be allocated. Consequently, it has probably happened only rarely. However, one of our users (see https://github.com/DoctorEvidence/lmdb-store/issues/48 for the discussion) has noticed this occurring, and it seems that it may be particularly likely to happen on macOS on the new M1 silicon. Perhaps threads there are more likely to yield execution after a mutex unlock; I am not sure. I was able to reproduce the issue by intentionally manipulating the timing (sleeping before the free) to verify that the race condition is technically feasible, and apparently this can happen "in the wild" on macOS, at least with an M1.
It is also worth noting that this is due to (or a regression from) the fix for ITS#9155 (https://github.com/LMDB/lmdb/commit/cb256f409bb53efeda4ac69ee8165a0b4fc1a277). That fix moved the free call outside the conditional (for having a parent). Previously, the free call was never executed after the mutex was unlocked; now it is called after the unlock.
Anyway, the solution is relatively simple: the free call has to be moved above the unlock. In my patch, I also moved the free call for mt_dirty_ovs. I am not sure what OVERFLOW_NOTYET/mt_dirty_ovs is for, but presumably it should be handled the same way. Alternatively, this could be solved by saving the references to these lists before unlocking and freeing them after unlocking, which would slightly decrease the amount of processing within the mutex-guarded code. Let me know if you prefer a patch that does that.
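For illustration, the two orderings could look roughly like this. This is a minimal sketch with pthreads, not the attached patch and not LMDB's actual code: `toy_txn`, `wmutex` and the three functions are hypothetical, plain free()/calloc() stand in for LMDB's own list routines, and only the field name mt_spill_pgs is taken from the report.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the reused write-transaction structure. */
typedef struct {
    pthread_mutex_t wmutex;   /* stand-in for the writer mutex */
    unsigned *mt_spill_pgs;   /* list owned by the current writer */
} toy_txn;

/* Stand-in for starting a write transaction: take the mutex,
 * then allocate a fresh list for this writer. */
static void begin_txn(toy_txn *txn, size_t n)
{
    pthread_mutex_lock(&txn->wmutex);
    txn->mt_spill_pgs = calloc(n, sizeof(unsigned));
}

/* Option 1: free while the mutex is still held, then unlock. */
static void end_txn_free_then_unlock(toy_txn *txn)
{
    free(txn->mt_spill_pgs);
    txn->mt_spill_pgs = NULL;
    pthread_mutex_unlock(&txn->wmutex);
}

/* Option 2: save the reference under the mutex, unlock, then free
 * the saved copy. The shared field is never read after the unlock. */
static void end_txn_detach_then_free(toy_txn *txn)
{
    unsigned *spill = txn->mt_spill_pgs;
    txn->mt_spill_pgs = NULL;
    pthread_mutex_unlock(&txn->wmutex);
    free(spill);
}
```

In both variants the shared structure is touched only while the mutex is held. The second keeps the free() call itself outside the critical section, at the cost of one extra local variable.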
--- Comment #1 from Howard Chu hyc@openldap.org ---
> Anyway, the solution is relatively simple, the free call simply has to be moved above the unlock. In my patch, I also moved the free call for mt_dirty_ovs. I am not sure what OVERFLOW_NOTYET/mt_dirty_ovs is for, but presumably it should be handled the same.
The NOTYET code is a work in progress and not usable yet.