Hello,
we are using LMDB as the underlying storage engine for a lightweight + high performance special-purpose object + mass data database. I have 2 questions about the size of the physical file used by LMDB:
To create the environment, we are using a mapsize of 1 GiB and the flags MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size that seems to correspond to the size of the data actually stored. However, under Windows, the file size is the same as the mapsize, namely 1 GiB. We are currently using env_copy2 and MDB_COMPACT to push this down every time the env is closed, but I fear that this will become very slow with large databases.
The same issue surfaced under Linux when we were recently experimenting with the MDB_WRITEMAP option to improve performance when dealing with very large data sets. This option also caused the file size under Linux to go up to 1 GiB, even though the actual data was < 50 K.
We'd like to hear if there are ways to improve this.
thanks, Christian
Christian Sell wrote:
We'd like to hear if there are ways to improve this.
No.
This is how memory mapped files work on Windows. There is no way to change that.
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366537%28v=vs.85%...
Likewise for writable mmaps on POSIX systems. Read your operating system documentation.
On 09/11/15 18:47, Christian Sell wrote:
To create the environment, we are using a mapsize of 1 GiB and the flags MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size that seems to correspond to the size of the data actually stored. However, under Windows, the file size is the same as the mapsize, namely 1 GiB. (...) The same issue surfaced under Linux (...) with the MDB_WRITEMAP option
That's the logical size, which can be bigger than the physical size. In LMDB's case, the unused end of the file doesn't take up any disk space, at least on filesystems that support sparse files, and most do. So never mind mdb_copy: there is no problem to fix.
On Unix, 'du <file>' shows disk usage. Don't know about Windows.
If you want to copy the file anyway, you should use mdb_copy rather than a plain file copy. MDB_COMPACT does shrink the file somewhat, since it drops pages which LMDB has freed and not yet reused, but that's another matter. The DB would grow again later anyway, since LMDB does need pages it can write to.
Hallvard
--On Monday, November 09, 2015 11:00 PM +0100 Hallvard Breien Furuseth h.b.furuseth@usit.uio.no wrote:
That's the logical size, which can be bigger than the physical size. In LMDB's case, the end of the file doesn't use any disk space. On filesystems which support this, anyway. Most do.
Here's a real world example:
[zimbra@ldap01 db]$ ls -l data.mdb
-rw------- 1 zimbra zimbra 17967149056 Nov 9 16:19 data.mdb
[zimbra@ldap01 db]$ du -c -h data.mdb
76M data.mdb
76M total
I.e., real usage is 76MB vs the approximately 17GB configured max size.
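The same gap between logical and allocated size can be reproduced without LMDB. The following is a minimal sketch, assuming a POSIX system and a filesystem with sparse-file support; the 64 MiB map size, the file name, and the helper names are made up for illustration, and this is not how LMDB itself sets up its map. It grows a file to the map size, writes about 50 K through a writable mapping, and reports both sizes, much like `ls -l` and `du` above.

```python
import mmap
import os
import tempfile

MAP_SIZE = 64 * 1024 * 1024   # hypothetical map size (64 MiB), kept small for the demo
PAYLOAD = b"x" * 50_000       # roughly the "< 50 K" of real data


def make_mapped_file(path, map_size=MAP_SIZE, payload=PAYLOAD):
    """Grow a file to map_size, map it writable, and store payload at the start."""
    with open(path, "w+b") as f:
        f.truncate(map_size)                      # sets the logical size only
        with mmap.mmap(f.fileno(), map_size) as m:
            m[: len(payload)] = payload           # only these pages are dirtied
            m.flush()


def sizes(path):
    """Return (logical_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # st_blocks is in 512-byte units on Linux; it does not exist on Windows.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.mdb")
        make_mapped_file(path)
        logical, allocated = sizes(path)
        print(f"logical size (ls -l): {logical} bytes")
        print(f"allocated    (du):    {allocated} bytes")
```

On a filesystem with sparse-file support, the allocated figure should stay close to the payload size while the logical size equals the full map size.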
--Quanah
--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Hallvard Breien Furuseth h.b.furuseth@usit.uio.no wrote on 09.11.2015 at 23:00 in message 56411792.8080205@usit.uio.no:
If you want to copy the file anyway, you should use mdb_copy rather than a plain file copy. MDB_COMPACT does shrink the file somewhat, since it drops pages which LMDB has freed and not yet reused, but that's another matter. The DB would grow again later anyway, since LMDB does need pages it can write to.
I wonder, as SSDs become more and more common: should LMDB have a way to signal to the operating system that some parts of the file are no longer in use, so that the OS -> filesystem -> block device could actually reclaim the space?
Ulrich Windl wrote:
I wonder, as SSDs become more and more common: should LMDB have a way to signal to the operating system that some parts of the file are no longer in use, so that the OS -> filesystem -> block device could actually reclaim the space?
No.
Pages deleted in one transaction will be reused in a subsequent transaction. There's no benefit to telling the OS to deallocate them, since they will just need to be reallocated again shortly after. Doing so would kill both overall performance, by issuing extraneous filesystem ops, and the SSD itself, by issuing extraneous metadata updates to the device, causing it to wear out faster.
LMDB manages pages the way it does *because that is the optimal way to do so*.
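To make the reuse argument concrete, here is a toy model, not LMDB's actual code: the `ToyPager` class and its method names are invented for illustration, and it deliberately ignores transactions and readers. It only shows that when freed pages go onto a free list and are handed out again first, the file's high-water mark does not move.

```python
class ToyPager:
    """Toy page allocator: freed pages go on a free list and are reused first."""

    def __init__(self):
        self.next_page = 0   # high-water mark: number of pages the file has needed
        self.free = []       # page numbers released by earlier "transactions"

    def alloc(self):
        """Hand out a freed page if one exists, otherwise extend the file."""
        if self.free:
            return self.free.pop()
        page = self.next_page
        self.next_page += 1
        return page

    def release(self, page):
        """Mark a page as reusable; nothing is returned to the OS."""
        self.free.append(page)


if __name__ == "__main__":
    pager = ToyPager()
    pages = [pager.alloc() for _ in range(100)]   # first batch of writes
    for p in pages[:40]:
        pager.release(p)                          # a later batch deletes 40 pages
    for _ in range(40):
        pager.alloc()                             # new writes reuse the freed pages
    print("high-water mark:", pager.next_page)
```

In this model the second round of allocations is served entirely from the free list, so deallocating those pages in between would only have forced the filesystem to allocate them again.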
openldap-technical@openldap.org