Hallvard Breien Furuseth wrote:
On 2014-02-03 22:14, Howard Chu wrote:
Was chatting with Emmanuel Lecharny (who is currently working on Mavibot for ApacheDS, an MVCC backend similar to LMDB) and had an interesting realization: we can avoid the current issue of long-lived reader txns preventing page reclamation.
(...) since we're already going to add a txnID to every page's page header, we can simply add a 2nd txnID, recording the txnID of the previous change to this page's ancestor. Then, any page where this prevTxnID is >= the outstanding reader's txnID can be reclaimed.
Nice. But: Seems to me the freelist needs to know when the page was written, and when it was freed. A reader older than the txn which wrote the current contents of a page is irrelevant to whether the page can be overwritten. How do ancestors matter? (Do you mean a branch page? The age of the previous page contents?)
I meant the age of the previous page contents.
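To make the written/freed reasoning above concrete, here is a minimal sketch in C. It is not LMDB code: the names `txnid_t`, `old_contents_visible` and `old_contents_reclaimable` are hypothetical, and it assumes a reader with snapshot txnID r sees exactly the page versions written by txns <= r and not yet replaced as of r. The exact boundary (> versus >=) depends on how reader txnIDs are numbered, which is an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t txnid_t;   /* hypothetical alias, not LMDB's definition */

/* Old page contents were written by txn `written` and replaced (freed) by
 * txn `freed`. Under the assumed snapshot rule, a reader still needs them
 * only if they existed at its snapshot and were replaced after it. */
bool old_contents_visible(txnid_t written, txnid_t freed, txnid_t reader)
{
    return written <= reader && reader < freed;
}

/* The old contents may be overwritten once no outstanding reader can see
 * them. A reader older than `written` does not block reclamation. */
bool old_contents_reclaimable(txnid_t written, txnid_t freed,
                              const txnid_t *readers, size_t nreaders)
{
    for (size_t i = 0; i < nreaders; i++) {
        if (old_contents_visible(written, freed, readers[i]))
            return false;
    }
    return true;
}
```

With a long-lived reader at txn 5 and a page written in txn 10 and freed in txn 20, the reader never saw those contents, so under these assumptions it does not hold the page back.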
Still thinking about the actual implementation of this, it may make more sense to store the prevTxnID in the freeDB than in each page header. Ideally we want to be able to grab a chunk of pageIDs unambiguously, instead of having to iterate thru each page and read its header to determine if it's safe.
Yes, re-reading lots of pages just to find they can't be used does not sound fun. And we can't look for page headers in a broken-up overflow page. Unless we either quit breaking up freed ovpages, or write a page header to the unused chunk when breaking it up.
Can we grab Mavibot's freeDB structure?
Currently we're just using an array of pageIDs; we can turn that into an array of <pageID,prevTxnID> pairs instead.
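A rough sketch of what such an array of pairs could look like, and how a writer might pick reusable pages from it without reading any page headers. Everything here is illustrative: `free_entry`, `collect_reusable` and the typedefs are hypothetical names, the txn that freed the pages is assumed to be known per record (passed in as `freed_txnid`), and the visibility rule (a reader at snapshot r sees contents written by txns <= r and freed after r) is an assumption rather than the actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pgno_t;    /* hypothetical aliases for this sketch */
typedef uint64_t txnid_t;

/* One element of the proposed <pageID,prevTxnID> array. */
typedef struct free_entry {
    pgno_t  pgno;        /* page that was freed */
    txnid_t prev_txnid;  /* txn that wrote the contents being freed */
} free_entry;

/* Copy into out[] the page numbers from one free-list record that no
 * outstanding reader can still see. All entries are assumed to have been
 * freed by freed_txnid. out[] must have room for n items.
 * Returns the number of page numbers written to out[]. */
size_t collect_reusable(const free_entry *entries, size_t n,
                        txnid_t freed_txnid,
                        const txnid_t *readers, size_t nreaders,
                        pgno_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int needed = 0;
        for (size_t r = 0; r < nreaders; r++) {
            /* Reader needs the old contents if they existed at its
             * snapshot and were replaced after it. */
            if (entries[i].prev_txnid <= readers[r] &&
                readers[r] < freed_txnid) {
                needed = 1;
                break;
            }
        }
        if (!needed)
            out[count++] = entries[i].pgno;
    }
    return count;
}
```

The decision is made from the array alone, which is the point of keeping prevTxnID in the freeDB rather than in each page header. The cost is a larger record per freed page, and the reusable pages are no longer guaranteed to form one unambiguous chunk.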