Other than this thread:
http://t23307.network-openldap-technical.opennetworks.info/lmdb-growing-the-...
I don't see a discussion of changing the 'maxsize' setting after an LMDB database has been generated.
This thread includes this response about growing the database:
http://www.openldap.org/lists/openldap-technical/201402/msg00302.html
On Windows, Linux, and FreeBSD, there's no problem increasing the mapsize and preserving the existing data.
(I'm making a wild assumption that 'mapsize' is a typo, and 'maxsize' was intended.)
Can 'maxsize' ever be reduced after the fact? If so, is there guidance as to how much it can change (perhaps based on mdb_stat)?
The problem I'm trying to solve:
For my $job, we provide OpenLDAP-backed clustered appliances to customers. The hardware doesn't vary, but the size of individual customers' databases does.
- Our strategy for adding members to the cluster involves managing backups (compressed tarballs). Our prior use of the now-ancient bdb backend kept these backups lightweight for smaller customers, while larger customers took the hit for having big databases.
- Also, upgrading appliances means importing data from the customers' bdb-based server.
My naive use of the LMDB backend assumes the worst case, and now everyone is equally punished for having a 'big' (albeit sparse) database.
My hope was that, given either the data in an LDIF extract or data about the legacy bdb database itself, we could make a more conservative guess at a reasonable size for the mdb backend.
Has anyone written up strategies on these topics, or is anyone in a position to provide recommendations?
Brian Reichert wrote:
Other than this thread:
http://t23307.network-openldap-technical.opennetworks.info/lmdb-growing-the-...
I don't see a discussion of changing the 'maxsize' setting after an LMDB database has been generated.
This thread includes this response about growing the database:
http://www.openldap.org/lists/openldap-technical/201402/msg00302.html
On Windows, Linux, and FreeBSD, there's no problem increasing the mapsize and preserving the existing data.
(I'm making a wild assumption that 'mapsize' is a typo, and 'maxsize' was intended.)
maxsize is the back-mdb keyword. mapsize is the LMDB API property. They both refer to the same thing. We used the word "maxsize" for back-mdb to impress upon sysadmins that this really is a long-term maximum, and not a setting that should be tuned on an ongoing basis.
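Concretely, back-mdb's maxsize keyword corresponds to LMDB's mdb_env_set_mapsize(), which must be called after mdb_env_create() and before mdb_env_open(). A minimal sketch of the API side; the "./testdb" path and the 1 GiB size are placeholder values:

    #include <stdio.h>
    #include <lmdb.h>

    int main(void)
    {
        MDB_env *env;
        int rc;

        if ((rc = mdb_env_create(&env)) != 0) {
            fprintf(stderr, "mdb_env_create: %s\n", mdb_strerror(rc));
            return 1;
        }

        /* The API-level analogue of "maxsize 1073741824" in slapd.conf:
         * fix the map size (= maximum database size) up front. */
        if ((rc = mdb_env_set_mapsize(env, 1UL << 30)) != 0) {  /* 1 GiB */
            fprintf(stderr, "mdb_env_set_mapsize: %s\n", mdb_strerror(rc));
            return 1;
        }

        /* The directory must already exist. */
        if ((rc = mdb_env_open(env, "./testdb", 0, 0664)) != 0) {
            fprintf(stderr, "mdb_env_open: %s\n", mdb_strerror(rc));
            return 1;
        }

        mdb_env_close(env);
        return 0;
    }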
Can 'maxsize' ever be reduced after the fact? If so, is there guidance as to how much it can change (perhaps based on mdb_stat)?
Read the LMDB documentation.
The problem I'm trying to solve:
For my $job, we provide OpenLDAP-backed clustered appliances to customers. The hardware doesn't vary, but the size of individual customers' databases does.
Our strategy for adding members to the cluster involves managing backups (compressed tarballs). Our prior use of the now-ancient bdb backend kept these backups lightweight for smaller customers, while larger customers took the hit for having big databases.
Also, upgrading appliances means importing data from the customers' bdb-based server.
My naive use of the LMDB backend assumes the worst case, and now everyone is equally punished for having a 'big' (albeit sparse) database.
"Punished"? There is no penalty for configuring a large maxsize, no matter how small the actual data.
My hope was that, given either the data in an LDIF extract or data about the legacy bdb database itself, we could make a more conservative guess at a reasonable size for the mdb backend.
There's no reason to bother doing this.
Has anyone written up strategies on these topics, or is anyone in a position to provide recommendations?
On Thu, Aug 21, 2014 at 03:28:42PM -0700, Howard Chu wrote:
maxsize is the back-mdb keyword. mapsize is the LMDB API property. They both refer to the same thing. We used the word "maxsize" for back-mdb to impress upon sysadmins that this really is a long-term maximum, and not a setting that should be tuned on an ongoing basis.
Ok, that explains the terminology.
Can 'maxsize' ever be reduced after the fact? If so, is there guidance as to how much it can change (perhaps based on mdb_stat)?
Read the LMDB documentation.
What, this: http://symas.com/mdb/doc/ ?
A search for 'maxsize' or 'mapsize' yields no hits.
The mdb_stat manpage tells me how to invoke it, but not how to interpret the results. I have seen other messages on this mailing list provide some guidance, but nothing that seemed to directly apply to my questions. Perhaps I'm missing some keyword somewhere.
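For reference, the figures mdb_stat -e prints come from LMDB's mdb_env_info() and mdb_env_stat() calls, and they can be combined to estimate how much of the map is actually in use. A rough sketch; the default path is a placeholder, and the arithmetic is an approximation, not an official formula:

    #include <stdio.h>
    #include <lmdb.h>

    int main(int argc, char **argv)
    {
        MDB_env *env;
        MDB_envinfo info;
        MDB_stat st;
        const char *path = argc > 1 ? argv[1] : "./testdb";  /* placeholder */

        if (mdb_env_create(&env) ||
            mdb_env_open(env, path, MDB_RDONLY, 0664)) {
            fprintf(stderr, "cannot open %s\n", path);
            return 1;
        }
        mdb_env_info(env, &info);
        mdb_env_stat(env, &st);

        /* (last_pgno + 1) * psize approximates the data file's
         * high-water mark -- the part a backup actually has to carry. */
        printf("mapsize:     %zu bytes\n", info.me_mapsize);
        printf("used (est.): %zu bytes\n",
               (size_t)(info.me_last_pgno + 1) * st.ms_psize);

        mdb_env_close(env);
        return 0;
    }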
My naive use of the LMDB backend assumes the worst case, and now everyone is equally punished for having a 'big' (albeit sparse) database.
"Punished"? There is no penalty for configuring a large maxsize, no matter how small the actual data.
The 'punishment' is threefold:
- consumption of disk space for storage (our database is stored on the same partition as our backups; perhaps not the best of plans).
- the time it takes to compress/uncompress a backup.
- the network bandwidth cost of transmitting a file that's larger than it needs to be.
Brian Reichert wrote:
On Thu, Aug 21, 2014 at 03:28:42PM -0700, Howard Chu wrote:
maxsize is the back-mdb keyword. mapsize is the LMDB API property. They both refer to the same thing. We used the word "maxsize" for back-mdb to impress upon sysadmins that this really is a long-term maximum, and not a setting that should be tuned on an ongoing basis.
Ok, that explains the terminology.
Can 'maxsize' ever be reduced after the fact? If so, is there guidance as to how much it can change (perhaps based on mdb_stat)?
Read the LMDB documentation.
What, this: http://symas.com/mdb/doc/ ?
A search for 'maxsize' or 'mapsize' yields no hits.
Seriously? http://symas.com/mdb/doc/group__mdb.html#gaa2506ec8dab3d969b0e609cd82e619e5
The mdb_stat manpage tells me how to invoke it, but not how to interpret the results. I have seen other messages on this mailing list provide some guidance, but nothing that seemed to directly apply to my questions. Perhaps I'm missing some keyword somewhere.
My naive use of the LMDB backend assumes the worst case, and now everyone is equally punished for having a 'big' (albeit sparse) database.
"Punished"? There is no penalty for configuring a large maxsize, no matter how small the actual data.
The 'punishment' is threefold:
- consumption of disk space for storage (our database is stored on the same partition as our backups; perhaps not the best of plans).
Nonsense. It is a sparse file and doesn't consume any more space than is actually being used.
- the time it takes to compress/uncompress a backup.
Nonsense. Use mdb_copy to take the backup.
- the network bandwidth cost of transmitting a file that's larger than it needs to be.
Nonsense.
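For reference, mdb_copy is a thin wrapper over LMDB's environment-copy API, and newer releases (0.9.14 and later, if I have the history right) add a compacting mode that omits unused pages, so the copy is only as large as the live data. A minimal sketch; both paths are placeholders, and the destination directory must already exist:

    #include <stdio.h>
    #include <lmdb.h>

    int main(void)
    {
        MDB_env *env;
        int rc;

        if (mdb_env_create(&env) ||
            mdb_env_open(env, "./testdb", MDB_RDONLY, 0664)) {
            fprintf(stderr, "cannot open environment\n");
            return 1;
        }

        /* MDB_CP_COMPACT omits free pages while copying -- the same
         * effect as "mdb_copy -c srcdir dstdir" on the command line. */
        rc = mdb_env_copy2(env, "./backup", MDB_CP_COMPACT);
        if (rc)
            fprintf(stderr, "mdb_env_copy2: %s\n", mdb_strerror(rc));

        mdb_env_close(env);
        return rc ? 1 : 0;
    }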
On Thu, Aug 21, 2014 at 07:14:54PM -0700, Howard Chu wrote:
Brian Reichert wrote:
What, this: http://symas.com/mdb/doc/ ?
A search for 'maxsize' or 'mapsize' yields no hits.
Seriously? http://symas.com/mdb/doc/group__mdb.html#gaa2506ec8dab3d969b0e609cd82e619e5
Yes, seriously. I typed those keywords into the search box in the upper right, and each time, the dialog that pops up says 'No Matches'.
Searching for the 'mdb_env_set_mapsize' string from that URL does yield a hit, so searching, per se, does work.
That does describe the consequences of resizing downward, so thanks for that pointer.
--On Thursday, August 21, 2014 6:02 PM -0400 Brian Reichert reichert@numachi.com wrote:
Has anyone written up strategies on these topics, or is anyone in a position to provide recommendations?
Zimbra already does all of these things using writemap. You could look at what we've done in relation to this. But I agree with Howard, there's generally little advantage to downsizing the writemap.
--Quanah
--
Quanah Gibson-Mount
Server Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
On Thu, Aug 21, 2014 at 03:55:09PM -0700, Quanah Gibson-Mount wrote:
Zimbra already does all of these things using writemap. You could look at what we've done in relation to this. But I agree with Howard, there's generally little advantage to downsizing the writemap.
Ok, noted, downsizing is not an appropriate tactic. That means making a reasonably good guess to start with, something short of 'use the whole disk'.
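To make that concrete, here's the sort of back-of-the-envelope guess I have in mind; every number below is invented purely for illustration:

    500,000 entries x ~2 KiB per entry x 4 (fudge factor for indices
    and B-tree overhead) ~= 4 GiB of live data
    -> round up generously, to e.g. 16 GiB of maxsize

Even generous rounding like that falls well short of 'use the whole disk'.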
You've suggested writemap in response to other questions I've asked on this list; I think I shall take the hint. :)
--On Thursday, August 21, 2014 10:53 PM -0400 Brian Reichert reichert@numachi.com wrote:
On Thu, Aug 21, 2014 at 03:55:09PM -0700, Quanah Gibson-Mount wrote:
Zimbra already does all of these things using writemap. You could look at what we've done in relation to this. But I agree with Howard, there's generally little advantage to downsizing the writemap.
Ok, noted, downsizing is not an appropriate tactic. That means making a reasonably good guess to start with, something short of 'use the whole disk'.
You've suggested writemap in response to other questions I've asked on this list; I think I shall take the hint. :)
Is your OS a BSD or Linux?
Last time I tried on BSD, it didn't use sparse files with writemap the way Linux does. It just created a file on disk at the full size specified by maxsize. I.e., BSD didn't correctly support sparse files with writemap.
--Quanah
On Fri, Aug 22, 2014 at 12:11:51AM -0700, Quanah Gibson-Mount wrote:
--On Thursday, August 21, 2014 10:53 PM -0400 Brian Reichert reichert@numachi.com wrote:
You've suggested writemap in response to other questions I've asked on this list; I think I shall take the hint. :)
Is your OS a BSD or Linux?
Sorry, I failed to mention that detail: this is a Linux distro, CentOS 6.5.
Last time I tried on BSD, it didn't use sparse files with writemap the way Linux does. It just created a file on disk at the full size specified by maxsize. I.e., BSD didn't correctly support sparse files with writemap.
Perhaps unrelated, but mdb_copy and the native 'cp' and 'tar' commands do handle sparse files correctly.
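For anyone who wants to verify sparseness directly, comparing a file's allocated blocks against its apparent size is the usual test (the same comparison 'du' makes versus 'ls -l'). A small sketch; the path is a placeholder for wherever your data.mdb lives:

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat st;
        const char *path = "./testdb/data.mdb";  /* placeholder path */

        if (stat(path, &st) != 0) {
            perror(path);
            return 1;
        }
        /* st_blocks counts 512-byte units actually allocated; a sparse
         * file allocates fewer bytes than its apparent st_size. */
        printf("apparent size: %lld bytes\n", (long long)st.st_size);
        printf("allocated:     %lld bytes\n", (long long)st.st_blocks * 512);
        printf("%s\n", (long long)st.st_blocks * 512 < (long long)st.st_size
                       ? "sparse" : "fully allocated");
        return 0;
    }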
* Quanah Gibson-Mount:
Last time I tried on BSD, it didn't use sparse files with writemap the way Linux does. It just created a file on disk at the full size specified by maxsize. I.e., BSD didn't correctly support sparse files with writemap.
Even on Linux, sparse files that are filled incrementally can result in lots of fragmentation, huge extent lists (tens of thousands of entries long), and long delays when opening such files.
Florian Weimer wrote:
* Quanah Gibson-Mount:
Last time I tried on BSD, it didn't use sparse files with writemap the way Linux does. It just created a file on disk at the full size specified by maxsize. I.e., BSD didn't correctly support sparse files with writemap.
Even on Linux, sparse files that are filled incrementally can result in lots of fragmentation, huge extent lists (tens of thousands of entries long), and long delays when opening such files.
It depends on the filesystem, not the operating system. The original BSD FFS supported sparse files just fine.
On 08/28/2014 07:24 AM, Florian Weimer wrote:
Even on Linux, sparse files that are filled incrementally can result in lots of fragmentation, huge extent lists (tens of thousands of entries long), and long delays when opening such files.
The only "hole" is at the end. So this shouldn't be different from just write()ing at the end of a non-sparse file: lmdb uses new file pages in the same order either way.
Except with mdb_put(..., data size >= 2 pages, MDB_RESERVE) and the user filling in the item from the end forward. Then there will temporarily be a hole in the middle of the item. I suppose if the user fills the item in slowly enough for the OS to fsync, the file will get fragmented.
I suppose mdb_page_alloc() with WRITEMAP could memset new file pages, or at least set one word in each new OS page. I expect user programs usually fill in MDB_RESERVE items quickly though, so hopefully it won't matter.
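For anyone following along, the reserve-then-fill pattern under discussion looks roughly like this; the key, the item size, and the fill are placeholders, and the transaction/dbi setup is assumed:

    #include <string.h>
    #include <lmdb.h>

    void reserve_example(MDB_txn *txn, MDB_dbi dbi)
    {
        MDB_val key, data;

        key.mv_size = sizeof("somekey") - 1;  /* placeholder key */
        key.mv_data = "somekey";
        data.mv_size = 3 * 4096;              /* a multi-page item */

        /* MDB_RESERVE: mdb_put() allocates mv_size bytes in the map and
         * returns a pointer to them in data.mv_data; the caller fills
         * the space before committing.  Filling it from the end forward
         * is what briefly leaves a hole in the middle of the item. */
        if (mdb_put(txn, dbi, &key, &data, MDB_RESERVE) == 0)
            memset(data.mv_data, 0xAB, data.mv_size);  /* front-to-back fill */
    }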