h.b.furuseth@usit.uio.no wrote:
Full_Name: Hallvard B Furuseth
Version: LMDB 0.9.11
OS: Linux x86_64
URL:
Submission from: (NULL) (129.240.6.254)
Submitted by: hallvard
Mapsize changes do not work as described, do not reliably store the mapsize in the map, and it's hard to see how it is supposed to work. E.g.:
- Open an environment twice, in processes X and Y.
- X grows the map and writes (commits) something to the DB. That MDB_meta gets the new mapsize.
- Y writes to the DB. It does not get MDB_MAP_RESIZED like the doc says, nor does it carry forward X's MDB_meta.mm_mapsize change.
The doc says the caller of set_mapsize is required to make sure there are no active transactions when it is called. As such, X failed this requirement, and this sequence of events is explicitly unsupported.
If Y doesn't start its write txn until after X finishes, then Y will see the new size.
- Process Z opens the environment without doing set_mapsize(), and gets the original mapsize from the MDB_meta written by Y.
For that matter, from reading the doc I'd expect a mapsize change to commit a txn with the new mapsize. There's no mention that the change (and the MDB_MAP_RESIZED) will wait for something to be committed.
mdb_txn_commit() writes nothing if the txn didn't change anything. It needs to notice that there is a mapsize change to write.
The doc talks about shrinking the map, but reduced mapsizes are not written to the datafile. Only increases are written.
All in all, it looks to me like _changing_ the mapsize should be an operation on a write transaction or invoke a write transaction, while setting the size or catching up with a mapsize change can be an environment operation. That way it would be possible to make sense of it. A txn can do it when it has no cursors and no dirty WRITEMAP pages (or WRITEMAP could spill all pages first).
BTW, I don't see the point of conditionally skipping the mapsize write in mdb_env_write_meta() when the full page gets written to disk anyway, as long as txn_begin() stashes the mapsize from the original meta so it knows what to write. (It need not obey the mapsize at that point, but it must carry a change forward.)
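To make the carry-forward idea concrete, here is a minimal toy model in C. It is only a sketch and not LMDB code: the toy_* types and functions are hypothetical stand-ins, and the rule "write the larger of the stashed size and this process's size" is an assumption that covers growth only, not shrinking. It replays the X/Y/Z sequence from the report.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the real LMDB structures. */
typedef struct {
    size_t mm_mapsize;  /* map size recorded in this meta page */
    size_t mm_txnid;    /* id of the txn that wrote this page */
} toy_meta;

typedef struct {
    toy_meta metas[2];  /* two alternating meta pages */
} toy_file;

typedef struct {
    toy_file *file;     /* data file shared by all processes */
    size_t env_mapsize; /* map size this process is using */
} toy_env;

typedef struct {
    toy_env *env;
    size_t txnid;
    size_t stashed_mapsize; /* mapsize read from the meta at txn begin */
} toy_txn;

/* Return the meta page with the highest txnid. */
toy_meta *toy_newest_meta(toy_file *f) {
    return f->metas[0].mm_txnid > f->metas[1].mm_txnid ? &f->metas[0]
                                                       : &f->metas[1];
}

/* Begin a write txn: remember the mapsize the newest meta carries. */
void toy_txn_begin(toy_env *env, toy_txn *txn) {
    assert(env != NULL && txn != NULL);
    toy_meta *m = toy_newest_meta(env->file);
    txn->env = env;
    txn->txnid = m->mm_txnid + 1;
    txn->stashed_mapsize = m->mm_mapsize;
}

/* Commit: always write a mapsize, carrying the stashed value forward
 * unless this process has a larger one (assumed growth-only rule). */
void toy_write_meta(toy_txn *txn) {
    toy_meta *dst = &txn->env->file->metas[txn->txnid % 2];
    size_t size = txn->stashed_mapsize;
    if (txn->env->env_mapsize > size)
        size = txn->env->env_mapsize;
    dst->mm_mapsize = size;
    dst->mm_txnid = txn->txnid;
}

/* X grows the map and commits, then Y commits with its old size.
 * Returns the mapsize a later opener Z would read from the newest meta. */
size_t toy_scenario(void) {
    const size_t MB = 1024 * 1024;
    toy_file f = { { { 1 * MB, 0 }, { 1 * MB, 1 } } };
    toy_env x = { &f, 2 * MB };  /* X has grown its map */
    toy_env y = { &f, 1 * MB };  /* Y still uses the original size */
    toy_txn t;

    toy_txn_begin(&x, &t);
    toy_write_meta(&t);
    toy_txn_begin(&y, &t);
    toy_write_meta(&t);
    return toy_newest_meta(&f)->mm_mapsize;
}
```

In this model Y's commit lands in the other meta page but still records X's larger size, because Y stashed it at txn begin. Z would therefore read the grown mapsize rather than the original one.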