Mark Zealey wrote:
- Creating database with non-sequential keys is very bad (on 4gb
databases, 2* slower than kyoto - about 1h30 and uses more memory).
This was actually a typo - kyoto only takes about 20 minutes to generate it, so 4* slower. However, using a commit every 1m inserts (and, because of a limitation in the perl module, also having to close the DB/env and reopen it) and backing it to a memdisk (which we also have to do with kyoto), it takes about 10% less time than kyoto. Doing it against normal disk is still very slow though.

Try again with commit every 100K inserts.

Size for a 4gb database was about 10% more than kyoto; where kyoto output a 1.5gb database, lmdb produced 2.5gb though. Doesn't matter too much for our purposes however.
Look at mdb_stat -ef on the resulting DB; you'll see that a large number of the pages claimed on disk are actually free pages in the DB. Larger commits leave more old pages behind than smaller commits.
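
For anyone wanting to try different commit intervals, here is a minimal sketch of the "commit every N inserts" loop in Python (not the perl module used above). The names batched, load_in_batches, begin_txn and DictTxn are hypothetical, made up for illustration; DictTxn is only a toy in-memory stand-in so the loop can be run without a database.

```python
from itertools import islice


def batched(pairs, batch_size):
    """Yield lists of at most batch_size (key, value) pairs."""
    it = iter(pairs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_in_batches(pairs, begin_txn, batch_size=100_000):
    """Insert all pairs, committing once per batch; return the commit count.

    begin_txn is a hypothetical callable that returns an object with
    put(key, value) and commit(). It stands in for opening a real
    write transaction.
    """
    commits = 0
    for batch in batched(pairs, batch_size):
        txn = begin_txn()
        for key, value in batch:
            txn.put(key, value)
        txn.commit()  # one commit per batch, not per insert
        commits += 1
    return commits


class DictTxn:
    """Toy in-memory transaction, used only to exercise the loop."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def put(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.store.update(self.pending)
        self.pending = {}
```

With the py-lmdb binding, begin_txn could presumably be something like `lambda: env.begin(write=True)` with bytes keys and values, but that is an assumption and untested here. Changing batch_size between 1m and 100K is then a one-argument change, which makes it easy to compare run time and the free-page count reported by mdb_stat -ef.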