Luke Kenneth Casson Leighton wrote:
> Not surprising. Remember, I've been writing
> since the 1980s. There is no other KV store that is full ACID and anywhere
> near as small or as fast as LMDB.

... or does range as well... *and* duplicate values per key. you
have a long way to go before beating tdb for number of lines of code
though :) written by rusty russell, andrew tridgell and others - did
you see it does hash buckets and then spin-locks (file-based) on each
hash chain, so you can have multiple simultaneous writers not just
readers? good trick that, i mention it just in case it's something
that could be deployed to good effect in lmdb, that would be awesome
to have parallel writes sped up as well per core.
... yes i realise that lmdb is read-optimised, but hey it's being
adopted elsewhere as well

Fair enough, tdb is smaller, but it's also missing all those other goodies
that we couldn't live without in OpenLDAP. And it doesn't perform.

Something that the world needs to understand - there is no such thing as
parallel writes, not in a transactional database. Every system out there gets
serialized sooner or later - most just get serialized in their WAL. All the
fine-grained locking they do before that point is just mental masturbation
ahead of that inevitable bottleneck. You should notice there are plenty of
benchmarks where LMDB's write performance is far faster than so-called
write-optimized databases too. When you jettison all the overhead of those
fine-grained locks your write path can get a lot faster, and with LMDB's
zero-copy writes they go faster still.

We fell for the fantasy of parallel writes with BerkeleyDB, but after a dozen+
years of poking, profiling, and benchmarking, it all becomes clear - all of
that locking overhead plus deadlock detection/recovery is just a waste of
resources. back-mdb isn't just faster than back-bdb/hdb for reads, it's also
several times faster for writes, and the absence of ubiquitous locks is a good
part of that.
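
To make the two locking schemes under discussion concrete, here is a toy
in-memory sketch in Python. It is not tdb's or LMDB's actual code: the class
names, the helper run_writers, and the use of threading.Lock in place of
file-based locks are all assumptions made for illustration. One table takes a
lock per hash chain, the other funnels every write through a single writer
lock.

```python
import threading
import zlib


class ChainLockedTable:
    """Toy hash table with one lock per hash chain (hypothetical stand-in)."""

    def __init__(self, n_chains=16):
        self.chains = [dict() for _ in range(n_chains)]
        self.locks = [threading.Lock() for _ in range(n_chains)]

    def _index(self, key):
        # Pick the chain from a hash of the key.
        return zlib.crc32(key) % len(self.chains)

    def put(self, key, value):
        i = self._index(key)
        # Only writers that hit the same chain wait for each other.
        with self.locks[i]:
            self.chains[i][key] = value

    def get(self, key):
        i = self._index(key)
        with self.locks[i]:
            return self.chains[i].get(key)


class SingleWriterTable:
    """Toy table where every write goes through one writer lock."""

    def __init__(self):
        self.data = {}
        self.write_lock = threading.Lock()

    def put(self, key, value):
        # All writers are serialized here, with no per-chain bookkeeping.
        with self.write_lock:
            self.data[key] = value


def run_writers(put, n_threads=4, per_thread=1000):
    """Start n_threads writers, each storing per_thread distinct keys."""

    def work(tid):
        for i in range(per_thread):
            put(b"w%d-%d" % (tid, i), b"v")

    threads = [threading.Thread(target=work, args=(t,)) for t in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
```

Calling run_writers(ChainLockedTable().put) or
run_writers(SingleWriterTable().put) exercises either scheme. The sketch only
shows where the locks sit; it says nothing about real-world throughput, and it
leaves out the transactional commit step where, per the argument above,
writers end up serialized anyway.
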
>> i wrote some python evaluation code that stored 5,000 records
>> 8-byte keys and 100-byte values before doing a transaction commit: it
>> managed 900,000 records per second (which is ridiculously fast even
>> what gives, there? the benchmarks show that this is supposed
>> faster (a *lot* faster) and that is simply not happening. is the
>> overhead from python that large it wipes out the speed advantages?

> No idea. I don't use python enough to have any insight there.
But these folks have some thoughts on it

ok.. is there some c-based benchmark code somewhere i can check
to do sequential writes, compare it with the python bindings? just to
make sure. it is very puzzling that there's a slow-down rather than a
All of the source code for the microbenchmarks is linked from the microbench
page. (Minus the actual LevelDB source tree, which you also need.)
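
For a quick sanity check on the Python side, the test described earlier in the
thread (8-byte keys, 100-byte values, a commit every 5,000 records) can be
sketched as a small timing harness. The names make_records and
timed_batched_writes are hypothetical, and the harness takes the actual write
function as a parameter, so it measures whatever store you hand it plus the
Python overhead around it.

```python
import struct
import time


def make_records(n, value_size=100):
    """Yield (key, value) pairs: 8-byte sequential keys, fixed-size values."""
    value = b"x" * value_size
    for i in range(n):
        # Big-endian packing keeps byte order equal to numeric order,
        # so the keys arrive in sorted (sequential) order.
        yield struct.pack(">Q", i), value


def timed_batched_writes(write_batch, total=100_000, batch_size=5_000):
    """Feed records to write_batch in groups; return (written, records/sec).

    write_batch is assumed to store a list of (key, value) pairs and
    commit them as one transaction.
    """
    batch = []
    written = 0
    start = time.perf_counter()
    for record in make_records(total):
        batch.append(record)
        if len(batch) == batch_size:
            write_batch(batch)
            written += len(batch)
            batch = []
    if batch:
        # Flush the final partial batch.
        write_batch(batch)
        written += len(batch)
    elapsed = time.perf_counter() - start
    return written, written / max(elapsed, 1e-9)
```

Passing a plain dict's update method (store = {}; timed_batched_writes(store.update))
gives a rough ceiling for what the Python loop itself can do with no database
behind it. Swapping in a function that wraps a real write transaction then
shows how much of the measured time belongs to the store and how much to the
interpreter.
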
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/