On 05/22/2015 04:05 PM, Dominik Taborsky wrote:
Hello,
I'm playing around with LMDB and I'd like to know more about how it deletes items and reclaims space. I've been having some problems with this lately.
For simplicity I'm testing small 20 MiB DBs with data values ranging from 40 B to 4000 B. The test checks several things, but mainly it does a few fillups and flushes in quick succession. The first cycle gradually increases the size of the data until the DB is full, then flushes it. Then 10 fillup-flush cycles with fixed-size data follow. For flushing I've tried both mdb_drop and removing the stored data in batches, with batch sizes between 2 and 100 items. The results differ considerably depending on all these parameters (only the 10 equal cycles counted):
I'd like to add that I tested this with LMDB 0.9.14 and git version from Tue 10 Feb 2015 (from AUR).
As for the source code, that's a bit harder: LMDB is wrapped in our library, which generalizes the API for transparent use with another backend. It also adds the aforementioned fillup check and (optional) flushing in batches. But to at least outline the use:
    size_t size = 300;
    for (; ret == KNOT_EOK && key < 5000; ++key) {
        data_t d;
        random_data(&d, key, size);
        size *= 1.5;
        ret = db_store_data(db, &d);
        data_clear(&d);
        db_flush(db);
    }

    db_close(db);
    int i, count;
    size = 4000;

    for (i = 0; i < 10; ++i) {
        ret = KNOT_EOK;
        count = 0;
        for (; ret == KNOT_EOK && key < 5000; ++key) {
            data_t d;
            random_data(&d, key, size);
            db = db_open(path, mapsize);
            ret = db_store_data(db, &d);
            if (ret == KNOT_EOK)
                ++count;

            db_close(db);
            data_clear(&d);
        }

        // test the insert
        ok(count > 0, "db: pass #%d fillup run", i + 1);

        db = db_open(path, mapsize);
        db_flush(db);
        db_close(db);
    }
I've edited the code a bit to make it clearer and more readable (it's an excerpt from a unit test). The fillup protection code is here:
    MDB_stat stat;
    int ret = mdb_stat(txn->txn, env->dbi, &stat);
    if (ret != MDB_SUCCESS) {
        return lmdb_error_to_knot(ret);
    }

    MDB_envinfo envinfo;
    ret = mdb_env_info(env->env, &envinfo);
    if (ret != MDB_SUCCESS) {
        return lmdb_error_to_knot(ret);
    }

    /* Guarantee there is enough space for erasing records from the DB. */
    size_t used_pages = stat.ms_branch_pages + stat.ms_leaf_pages +
                        stat.ms_overflow_pages;
    size_t data_pages = val->len / stat.ms_psize;
    size_t total_pages = envinfo.me_mapsize / stat.ms_psize;
    if (used_pages + data_pages + CLEAR_PAGE_NO >= total_pages) {
        return EBUSY;
    }
CLEAR_PAGE_NO is currently defined as 32.
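
To make the page arithmetic in this check easier to follow, here is a minimal, LMDB-free sketch of the same comparison. The helper names pages_for and has_room are hypothetical and are not part of LMDB or of the wrapper library. The only intentional difference from the excerpt is that the value size is rounded up to whole pages rather than down, so a 4000 B value counts as one page instead of zero. It is a simplification: it ignores the key, per-node overhead, and any pages LMDB needs for its own bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of pages needed to hold `len` bytes, rounded up.
 * Hypothetical helper, not an LMDB function. */
static size_t pages_for(size_t len, size_t page_size)
{
    assert(page_size > 0);
    return len / page_size + (len % page_size != 0 ? 1 : 0);
}

/* Returns true if storing a value of `val_len` bytes still leaves more than
 * `reserve` pages unused in a map of `map_size` bytes. This mirrors the
 * comparison in the excerpt: used + data + reserve >= total means "full". */
static bool has_room(size_t used_pages, size_t val_len, size_t page_size,
                     size_t map_size, size_t reserve)
{
    size_t total_pages = map_size / page_size;
    size_t needed = used_pages + pages_for(val_len, page_size) + reserve;
    return needed < total_pages;
}
```

As a worked case, assuming a 4096-byte page size (an assumption, the actual value comes from ms_psize): a 20 MiB map holds 5120 pages. With a reserve of 32 pages and 5087 pages in use, a 4000 B value needs 5087 + 1 + 32 = 5120 pages, so has_room reports the DB as full, whereas rounding the value size down would give 5119 and let the store proceed.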
If I could be of any further help to figure this out, let me know.
Thank you.
Best regards, Dominik Taborsky