Luke Kenneth Casson Leighton wrote:
Not surprising. Remember, I've been writing world's-fastest <whatevers> since the 1980s. There is no other KV store that is full ACID and anywhere near as small or as fast as LMDB.
... or does range queries as well... *and* duplicate values per key. you have a long way to go before beating tdb on number of lines of code though :) it was written by rusty russell, andrew tridgell and others - did you see that it does hash buckets and then spin-locks (file-based) on each hash chain, so you can have multiple simultaneous writers, not just readers? good trick, that. i mention it just in case it's something that could be deployed to good effect in lmdb - it would be awesome to have parallel writes sped up per core as well.
... yes i realise that lmdb is read-optimised, but hey, it's being adopted elsewhere as well.
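For illustration, a rough Python sketch of that per-chain locking trick: each writer takes a byte-range lock covering only the hash chain it is touching, so writers on different chains can proceed in parallel. The bucket count, offsets, and helper names here are invented for the sketch - tdb's real on-disk format differs, and tdb can use spinlocks as well as fcntl locks.

# toy sketch of tdb-style per-chain locking (illustrative only; layout,
# bucket count and offsets are invented, not tdb's real format)
import fcntl
import os
import zlib

NBUCKETS = 1024        # hypothetical number of hash chains
LOCK_BASE = 4          # hypothetical start of the per-chain lock region

def chain_of(key: bytes) -> int:
    # hash the key onto one of the chains
    return zlib.crc32(key) % NBUCKETS

def locked_update(path: str, key: bytes, value: bytes) -> None:
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    chain = chain_of(key)
    # exclusive fcntl lock on the single byte standing in for this chain;
    # writers whose keys hash to other chains are not blocked by this
    fcntl.lockf(fd, fcntl.LOCK_EX, 1, LOCK_BASE + chain)
    try:
        pass  # ... walk and rewrite the on-disk chain here ...
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, 1, LOCK_BASE + chain)
        os.close(fd)

locked_update('/tmp/toy-chain-lock.db', b'some key', b'some value')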
Fair enough, tdb is smaller, but it's also missing all those other goodies that we couldn't live without in OpenLDAP. And it doesn't perform.
Something that the world needs to understand - there is no such thing as parallel writes, not in a transactional database. Every system out there gets serialized sooner or later - most just get serialized in their WAL. All the fine-grained locking they do is just mental masturbation ahead of that inevitable bottleneck. You should notice there are plenty of benchmarks where LMDB's write performance is far faster than so-called write-optimized databases too. When you jettison all the overhead of those fine-grained locks, your write path can get a lot faster, and LMDB's zero-copy writes make it faster still.
We fell for the fantasy of parallel writes with BerkeleyDB, but after a dozen-plus years of poking, profiling, and benchmarking it all became clear - all of that locking overhead plus deadlock detection/recovery is just a waste of resources. back-mdb isn't just faster than back-bdb/hdb for reads, it's also several times faster for writes, and the absence of ubiquitous locks is a good part of that.
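To make the single-writer model concrete, here is a minimal sketch using the py-lmdb bindings (the path and keys are placeholders, and it assumes py-lmdb's default environment settings): write transactions are serialized, so a second env.begin(write=True) simply waits for the first to commit, while read transactions run against a stable snapshot and never block.

import lmdb

env = lmdb.open('/tmp/lmdb-writer-demo', map_size=1 << 30)

with env.begin(write=True) as txn:     # the single writer; a concurrent write
    txn.put(b'counter', b'0')          # transaction would wait here

rtxn = env.begin()                     # read-only snapshot, never blocks

with env.begin(write=True) as txn:     # the next writer gets the lock after the
    txn.put(b'counter', b'1')          # previous commit; writes never overlap

print(rtxn.get(b'counter'))            # b'0' - the snapshot taken earlier
rtxn.abort()
with env.begin() as txn:
    print(txn.get(b'counter'))         # b'1' - a fresh snapshot sees the commit
env.close()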
i wrote some python evaluation code that stored 5,000 records with 8-byte keys and 100-byte values before doing a transaction commit: it managed 900,000 records per second (which is ridiculously fast, even in python).
what gives there? the benchmarks show that this is supposed to be faster (a *lot* faster), and that is simply not happening. is the overhead from python really so large that it wipes out the speed advantages?
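For reference, a minimal py-lmdb sketch of the kind of test described above - 8-byte keys, 100-byte values, one commit per 5,000 records - might look like this (the path, map size and total record count are arbitrary choices for the sketch, not the original script):

# rough re-creation of the test described above using the py-lmdb bindings:
# 8-byte keys, 100-byte values, one commit per 5,000 records
import struct
import time

import lmdb

TOTAL = 500_000
BATCH = 5_000
VALUE = b'x' * 100

env = lmdb.open('/tmp/lmdb-batch-bench', map_size=1 << 30)

start = time.perf_counter()
for base in range(0, TOTAL, BATCH):
    with env.begin(write=True) as txn:            # one transaction per batch
        for i in range(base, base + BATCH):
            txn.put(struct.pack('>Q', i), VALUE)  # 8-byte big-endian key
elapsed = time.perf_counter() - start
print(f'{TOTAL / elapsed:,.0f} records/second')
env.close()

Keeping the inner loop inside one write transaction per batch matches the commit-every-5,000 setup; committing per record would pay the full commit cost 5,000 times as often and be drastically slower.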
No idea. I don't use python enough to have any insight there.
But these folks have some thoughts on it:
https://twitter.com/hyc_symas/status/451763166985613312
ok... is there some c-based benchmark code somewhere that i can use to check how to do sequential writes, so i can compare it with the python bindings? just to make sure. it is very puzzling that there's a slow-down rather than a speed-up.
All of the source code for the microbenchmarks is linked from the microbench page. (Minus the actual LevelDB source tree, which you also need.) http://symas.com/mdb/microbench
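As a starting point for the python side of that comparison, here is a hedged sketch of a sequential-write (fillseq-style) loop through the py-lmdb bindings; the sizes, counts and path are placeholders rather than the microbench parameters. Cursor.putmulti with append=True tells LMDB the keys arrive already sorted, so it can append at the end of the tree instead of searching for each key.

# hedged sketch of a sequential-write loop via the py-lmdb bindings
# (sizes, counts and path are placeholders, not the microbench parameters)
import struct
import time

import lmdb

TOTAL = 1_000_000
BATCH = 10_000
VALUE = b'x' * 100

env = lmdb.open('/tmp/lmdb-fillseq-demo', map_size=2 << 30)

start = time.perf_counter()
for base in range(0, TOTAL, BATCH):
    items = [(struct.pack('>Q', i), VALUE) for i in range(base, base + BATCH)]
    with env.begin(write=True) as txn:
        cur = txn.cursor()
        # append=True: keys are already sorted, so LMDB appends at the end
        # of the tree instead of searching for each key
        cur.putmulti(items, append=True)
        cur.close()
elapsed = time.perf_counter() - start
print(f'{TOTAL / elapsed:,.0f} sequential writes/second')
env.close()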