In openldap-technical, Quanah Gibson-Mount wrote:
> As per Howard Chu (Author of MDB, Primary OpenLDAP Developer):
> Full details are in the paper. http://www.openldap.org/pub/hyc/mdm-paper.pdf
> MDB assumes a unified buffer cache. See section 3.1, references 17, 18, and 19.
> Note that this requirement can be relaxed in the current version of the library. If you create the environment with the MDB_WRITEMAP option then all reads and writes are performed using mmap, so the file buffer cache is irrelevant. Of course then you lose the protection that the read-only map offers.
That's not quite true. mdb_env_open() does a read() of the meta pages.
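A minimal C sketch of what the unified buffer cache assumption means in practice: it stores bytes through a shared mapping and then checks whether a plain read() on the same descriptor sees them without any msync(). The function name and the scratch path are made up for illustration and are not part of MDB; only POSIX calls are used.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not an MDB function.
 * Returns 1 if bytes stored through a MAP_SHARED mapping are seen by a
 * following read() on the same file, 0 if they are not, -1 on error.
 * `path` names a scratch file that is created and then removed. */
int mmap_write_visible_to_read(const char *path)
{
    static const char msg[] = "meta page";
    char buf[sizeof msg];
    const size_t len = 4096;
    char *map;
    int result = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)len) == 0) {
        map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            /* Write through the map only; deliberately no msync(). */
            memcpy(map, msg, sizeof msg);

            /* The descriptor offset is still 0, so this reads the same bytes. */
            if (read(fd, buf, sizeof buf) == (ssize_t)sizeof buf)
                result = (memcmp(buf, msg, sizeof msg) == 0);

            munmap(map, len);
        }
    }

    close(fd);
    unlink(path);
    return result;
}
```

On a system with a unified buffer cache the call is expected to return 1. A return of 0 would indicate the kind of platform where mixing read() with mapped writes is unsafe.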
I presume this can only be a problem when other processes have the database open? In that situation, I think the read() can be avoided by maintaining a copy of the relevant MDB_meta information in the lock file. The read() only needs enough info to know how to map the data file.
This would require a version increase for the lock file and programs using it, but not for the database file.
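The lock-file idea above could look roughly like the following C sketch. Everything here is hypothetical: the struct layout, the magic and version values, and the function names are illustrative only and do not describe the actual MDB lock file format.

```c
#include <stddef.h>
#include <stdint.h>

#define LOCK_MAGIC   0xBEEFC0DEu  /* hypothetical magic value */
#define LOCK_VERSION 2u           /* hypothetical: bumped because a field was added */

/* Hypothetical lock-file header carrying a copy of the map size. */
struct lock_header {
    uint32_t magic;
    uint32_t version;
    uint64_t mapsize;   /* copy of the data file's map size */
};

/* Fill in the header when the lock file is first created. */
void lock_header_init(struct lock_header *h, size_t mapsize)
{
    h->magic = LOCK_MAGIC;
    h->version = LOCK_VERSION;
    h->mapsize = (uint64_t)mapsize;
}

/* Return the cached map size, or 0 if the header is from another
 * version and the caller must fall back to reading the meta pages. */
size_t lock_header_mapsize(const struct lock_header *h)
{
    if (h->magic != LOCK_MAGIC || h->version != LOCK_VERSION)
        return 0;
    return (size_t)h->mapsize;
}
```

The version check is what makes the lock-file format change visible to older programs while leaving the database file untouched.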
Hallvard Breien Furuseth wrote:
> That's not quite true. mdb_env_open() does a read() of the meta pages.
> I presume this can only be a problem when other processes have the database open? In that situation, I think the read() can be avoided by maintaining a copy of the relevant MDB_meta information in the lock file. The read() only needs enough info to know how to map the data file.
> This would require a version increase for the lock file and programs using it, but not for the database file.
Since, as you note, the read only needs to see the map size, none of this work is necessary.
Howard Chu writes:
From: Howard Chu <hyc@symas.com>
To: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>, openldap-devel@openldap.org
Subject: Re: Avoiding read() of meta pages
Date: Tue, 15 Jan 2013 20:56:40 +0000
> Hallvard Breien Furuseth wrote:
>> I presume this can only be a problem when other processes have the database open? In that situation, I think the read() can be avoided by maintaining a copy of the relevant MDB_meta information in the lock file. The read() only needs enough info to know how to map the data file.
>> This would require a version increase for the lock file and programs using it, but not for the database file.
> Since, as you note, the read only needs to see the map size, none of this work is necessary.
Why not? The mapsize can change; see the comment in mdb_env_write_meta(): /* Persist any increases of mapsize config */. It can also shrink during env_open, but I think that was a mistake as written. I will get back to that later.
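To illustrate why the stored map size is not a constant, here is a small C sketch of a "persist increases only" rule, as suggested by the quoted comment. The struct and function names are hypothetical and the logic is an assumption based on that comment alone, not a copy of mdb_env_write_meta().

```c
#include <stddef.h>

/* Hypothetical stand-in for the stored meta information. */
struct meta_sketch {
    size_t mapsize;
};

/* Raise the stored map size if the configured one is larger.
 * Returns 1 if the stored value changed, 0 otherwise.
 * Shrinking is deliberately not handled here. */
int persist_mapsize_increase(struct meta_sketch *m, size_t configured)
{
    if (configured > m->mapsize) {
        m->mapsize = configured;
        return 1;
    }
    return 0;
}
```

Any process that cached the old value, for example from a lock file, would have to notice such a change before mapping the data file.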