--On Saturday, March 05, 2011 5:05 AM -0800 Howard Chu hyc@symas.com wrote:
I've been working on a new "in-memory" B-tree library that operates on an mmap'd file. It is a copy-on-write design; it supports MVCC and is immune to corruption and requires no recovery procedure. It is not an append-only design, since that requires explicit compaction, and also is not amenable to mmap usage. Also the append-only approach requires total serialization of write operations, which would be quite poor for throughput.
My experience with back-(bdb/hdb) and syncrepl was that the only reliable way to ensure consistent replication was to use delta-syncrepl, which... serializes write operations. In fact, not forcing serialized writes for back-(bdb/hdb) was slower than serializing them, because of all the contention in the database. I understand this may not hold true for back-mdb, but thought I would note that our best write performance is currently already achieved by serialization.
re: configuring the size of the DB file - this is most likely not a value that can be changed on an existing DB. I.e., if you configure a DB and find that you need to grow it later, you will probably need to slapcat/slapadd it again. At DB creation time the file is mmap'd with address NULL so that the OS picks the address, and the address is recorded in the DB. On subsequent opens the file is mmap'd at the recorded address. If the size is changed, and the process' address space is already full of other mappings, it may not be possible to simply grow the mapping at its current address. Since the DB records contain actual memory pointers based on the region address, any change in the mapping address would render the DB unusable.
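For reference, the mapping scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual back-mdb code: the header layout (struct db_header) and the function names db_create and db_open are hypothetical, and only standard POSIX calls are used. On creation the file is mapped at an address the OS picks and that address is recorded in the file; on reopen the recorded address is passed back to mmap as a hint, and the open is refused if the OS places the mapping anywhere else, since any absolute pointers stored in the records would then be wrong.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical on-disk header, for illustration only. */
struct db_header {
    uint64_t map_addr;  /* address the region was mapped at when created */
    uint64_t map_size;  /* configured size of the mapping, in bytes */
};

/* Create the DB file at its configured size, let the OS choose the
 * mapping address (addr == NULL), and record that address in the file. */
void *db_create(const char *path, size_t size)
{
    assert(size >= sizeof(struct db_header));

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    struct db_header *h = p;
    h->map_addr = (uint64_t)(uintptr_t)p;
    h->map_size = (uint64_t)size;
    /* Flush the header so a later read() of the file is sure to see it. */
    if (msync(p, sizeof *h, MS_SYNC) != 0) {
        munmap(p, size);
        return NULL;
    }
    return p;
}

/* Reopen the DB file at the recorded address. The address is passed as a
 * hint (no MAP_FIXED), so existing mappings are never overwritten; if the
 * OS cannot honor the hint, the open fails instead of relocating. */
void *db_open(const char *path)
{
    struct db_header hdr;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    if (read(fd, &hdr, sizeof hdr) != (ssize_t)sizeof hdr) {
        close(fd);
        return NULL;
    }

    void *want = (void *)(uintptr_t)hdr.map_addr;
    void *p = mmap(want, (size_t)hdr.map_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;
    if (p != want) {
        /* Mapped somewhere else: stored pointers would be invalid. */
        munmap(p, (size_t)hdr.map_size);
        return NULL;
    }
    return p;
}
```

In this sketch the size is fixed when the file is created, and db_open maps exactly the recorded size. Growing the file would mean extending the mapping in place at the same address, which is the step that can fail when the neighboring address space is already occupied.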
How exactly does the DB file size for back-mdb relate to the existing size of the database? Do they have to match? I.e., is this more like the DB_CONFIG cachesize, which can be more or less than the database size, or are they supposed to be an exact match? We have plenty of customers whose databases are certainly not static in size, particularly those using an accesslog database for delta-syncrepl or other operations.
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration