Full_Name: Kristopher William Zyp
Version: LMDB 0.9.23
OS: Windows
URL: https://github.com/kriszyp/node-lmdb/commit/dc290553acb57fa3f2d6d88a0d5e0200...
Submission from: (NULL) (71.199.6.148)
In writemap mode in LMDB, it seems that a loose page can end up in me_dpages, causing a segfault on mdb_env_close or on writes after mdb_env_set_mapsize.
If I understand correctly, in MDB_WRITEMAP mode dirty pages should never enter me_dpages: dirty pages refer directly to the mapped memory, whereas me_dpages holds pages allocated from the heap (kept for reuse). However, I think a dirty page that has become a loose page can end up in me_dpages. When mdb_freelist_save() is called, it iterates through the loose pages and can call mdb_dpage_free at https://github.com/LMDB/lmdb/blob/mdb.RE/0.9/libraries/liblmdb/mdb.c#L3114, which puts the loose page into me_dpages. me_dpages then holds a reference to data in the memory map. This becomes apparent when you call mdb_env_close or mdb_env_set_mapsize: a segfault is triggered when LMDB attempts to free a memory-mapped page or to access a page that has since been unmapped.
Everywhere else in the code, the MDB_WRITEMAP flag prevents any calls to mdb_dpage_free (or mdb_dlist_free) in writemap mode (which I assume is intentional), except in mdb_freelist_save. It seems like the mdb_dpage_free call just needs to be moved up a few lines to the else block of the MDB_TXN_WRITEMAP conditional, so that it also won't be called in writemap mode.
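To make the proposed change concrete, here is a toy model of the loop as I understand it. This is a sketch, not LMDB's code: the types and the names page, env, dpage_free, release_loose, pool_len and mapped_in_pool are all hypothetical stand-ins, and the "fixed" parameter exists only so both behaviors can be compared side by side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT LMDB's real types. */
typedef struct page {
    struct page *next;
    int from_map;        /* 1 if the page lives in the memory map, 0 if heap */
} page;

typedef struct env {
    page *dpages;        /* plays the role of me_dpages: heap pages kept for reuse */
    int writemap;        /* plays the role of the MDB_TXN_WRITEMAP check */
} env;

/* Plays the role of mdb_dpage_free: push a page onto the reuse pool. */
static void dpage_free(env *e, page *p)
{
    assert(p != NULL);
    p->next = e->dpages;
    e->dpages = p;
}

/* Toy version of the loose-page loop.
 * fixed == 0: pool call is unconditional (the behavior I am reporting).
 * fixed == 1: pool call sits in the non-writemap branch (the proposed fix). */
static void release_loose(env *e, page **loose, size_t n, int fixed)
{
    for (size_t i = 0; i < n; i++) {
        if (e->writemap) {
            /* Page lives in the map; it must not be handed to the heap pool. */
        } else {
            if (fixed)
                dpage_free(e, loose[i]);
        }
        if (!fixed)
            dpage_free(e, loose[i]);
    }
}

/* Number of pages currently in the reuse pool. */
static size_t pool_len(const env *e)
{
    size_t n = 0;
    for (const page *p = e->dpages; p; p = p->next)
        n++;
    return n;
}

/* Number of pool entries that point into the map (should always be 0). */
static size_t mapped_in_pool(const env *e)
{
    size_t n = 0;
    for (const page *p = e->dpages; p; p = p->next)
        n += (size_t)p->from_map;
    return n;
}
```

In this model, calling release_loose with fixed set to 0 on a writemap env leaves map-backed pages in the pool, which is the state that would later be passed to free() or dereferenced after a remap. With fixed set to 1 the pool stays empty in writemap mode, and heap-backed pages are still pooled as before in non-writemap mode.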
I apologize that I can't provide a more isolated test case. This combination seems to be a rare occurrence and is very difficult to reproduce; I only occasionally observe it under large numbers of operations with frequent closing/resizing of the env. I don't understand the internals well enough to be 100% confident of this, and I may be misunderstanding this code path. However, this fix does seem to prevent these crashes for us, and ensuring that there are no mdb_dpage_free calls in writemap mode (only mdb_page_free calls) seems to be the legitimate intention of the code.
The attached URL contains the fix as a patch (against the node-lmdb project).