I wanted to resurrect this thread:
Hi,
I understand that the DB size has an upper limit set by the call to mdb_env_set_mapsize. I wonder what is the best strategy for growing the size.
The best strategy is to initially pick a large enough size that growing it is never an issue. E.g., set it to the amount of free space on the disk partition where the DB resides. As the docs state, when the DB is actually in use, there's no guarantee that there will be enough free system resources left for a resize attempt to even succeed. I.e., if you initially choose a small size, by the time you need to deal with it it may be too late.
I have a problem with the "best strategy" because at least on OS X, the reserved file size of "data.mdb" is just the amount set by mdb_env_set_mapsize. Clearly reserving the amount of free space in my user partition would not be desirable, especially if I want to run more than just one lmdb based program.
So I tried to grow my db progressively, by setting mdb_env_set_mapsize first to a small value and then enlarging it. Unfortunately it seems that whenever I increase the env mapsize, all the committed data is lost and I start with a clean database (determined by mdb_stat).
So what gives? The only thing I have come up with is that I would need to copy the old database(s) into the new one with a cursor, but that sounds lame.
Ciao
   Nat!
---------------------------------------------------
Tradition is an easy alternative to thinking what to do. -- J. Watkinson
On Feb 23, 2014, at 16.37, Nat! nat@mulle-kybernetik.com wrote:
generally speaking, i’d discourage you from looking at that limit from the perspective of “how large will my data be?”. instead, consider it a safeguard, for the os/environment. evaluate your particular environment, and use values amongst your various instances such that, were something unexpected to happen, the entire disk/partition/etc is not consumed to the point of choking out the os [or perhaps other more important processes, etc].
-ben
On 24.02.2014 at 04:21, btb@bitrate.net wrote:
I think this is valid if you're thinking in terms of "this is my database and this is my server it runs on". I am more trying to use lmdb as a persistable hashtable that I could put into a variety of my applications, which I and other people would use. I have no idea beforehand what the use is going to be on other people's devices, and most probably the other people wouldn't know either.
Currently I am making a clone of the environment, then creating a new, bigger environment and copying from small into big. This seems to work so far, but it just doesn't feel right to me.
Ciao
   Nat!
------------------------------------------------------
We wanted the best, but it turned out as always. - W. Tschernomyrdin
Nat! wrote:
Certainly haven't seen the behavior you describe, but I seldom test on MacOS or HFS+. I would use FFS, since it supports sparse files.
On Windows, Linux, and FreeBSD, there's no problem increasing the mapsize and preserving the existing data.
On 24.02.2014 at 15:56, Howard Chu hyc@symas.com wrote:
Maybe there isn't a problem with increasing the mapsize on OS X after all, because as soon as you wrote that, I couldn't reproduce the problem... This would make everything a lot easier.
I tried my test program with a 4 GB file. The available capacity goes down by 4 GB on OS X and it took quite some time to write the initial file, so I think I'll stick with the growth approach for now.
Thanks for the help so far.
Ciao
   Nat!
---------------------------------------------------------
Apple is the steam train that owns the tracks. - S. Jobs
Is this a safe heuristic? I query this self-written function before every mdb_put to see if it might fail because of space problems. If it could fail, then I just commit the transaction, grow the database and keep on doing mdb_put's. It worked so far, but that doesn't mean much :)
Basically my guess is that the size of the key and data, plus some overhead and rounding, is what's needed for the transaction pages and data pages, and I figure growing the btree could possibly add one more page:
/* (nat) this is supposed to be a conservative heuristic */
int   mdb_can_put( MDB_txn *txn,
                   MDB_dbi dbi,
                   MDB_val *key,
                   MDB_val *data,
                   unsigned int flags)
{
   MDB_env   *env;
   size_t    pgsize;
   size_t    size;
   pgno_t    pgno;
   int       num;

   env    = txn->mt_env;
   pgsize = env->me_psize - PAGEHDRSZ;
   size   = key->mv_size + 16 + data->mv_size + 16;   /* 16 for alignment/overhead voodoo */
   num    = (size + pgsize - 1) / pgsize;             /* pages needed for the data */

   if( txn->mt_dirty_room < num)
      return( MDB_TXN_FULL);

   pgno = txn->mt_next_pgno;
   num  = num * 2 + 1;   /* figure num data, num txn (?), one for btree expansion */
   if( pgno + num >= env->me_maxpg)
      return( MDB_MAP_FULL);
   return( 0);
}
Ciao
   Nat!
---------------------------------------------------
The youth of today love luxury, have bad manners and despise authority. They contradict their parents, cross their legs and tyrannize their teachers. -- Socrates
openldap-technical@openldap.org