Florian Weimer wrote:
Michael Ströder wrote:
Florian Weimer wrote:
Multiple concurrent writers are nice on paper, but are probably not worth the complexity for an in-process database.
Your statement sounds a bit like "640 kByte of RAM is enough for everybody" or similar famous misjudgments in IT history that have since been proven false.
E.g. my SeaMonkey and LibreOffice use the same cert/key database concurrently without going through a separate service.
I think there is a misunderstanding about what I meant by "concurrent". What I wanted to say is that Berkeley DB supports (in some cases at least) process 1 updating the values at keys A1, A2, A3 while, at the same time, process 2 updates the values at keys B1, B2, B3. That is, both transactions make progress at the same time, executing in an interleaved manner (even physically, on multi-processor machines). This is different from multiple processes opening the database for writing and obtaining an exclusive lock before updating it, although the observable behavior will not differ much if the transactions are short, the working set fits into memory, and so on.
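To make the distinction concrete, here is a rough sketch against Berkeley DB's C API (transactional data store). It assumes an environment opened with DB_INIT_TXN | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL and a database opened with DB_AUTO_COMMIT; update_keys() and the key names are made up for illustration:

    #include <db.h>
    #include <string.h>

    /* Each process calls this with its own keys (A1..A3 in one,
     * B1..B3 in the other).  Both transactions can make progress in
     * parallel as long as their keys do not share a B-tree page. */
    int update_keys(DB_ENV *env, DB *db, const char **keys, int n)
    {
        DB_TXN *txn;
        DBT key, val;
        int i, ret;

        if ((ret = env->txn_begin(env, NULL, &txn, 0)) != 0)
            return ret;

        for (i = 0; i < n; i++) {
            memset(&key, 0, sizeof(key));
            memset(&val, 0, sizeof(val));
            key.data = (void *)keys[i];
            key.size = (u_int32_t)strlen(keys[i]) + 1;
            val.data = (void *)"updated";
            val.size = sizeof("updated");

            if ((ret = db->put(db, txn, &key, &val, 0)) != 0) {
                txn->abort(txn);  /* releases the page locks held so far */
                return ret;
            }
        }
        return txn->commit(txn, 0);
    }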
But the sad truth is that you cannot always prevent writers from blocking writers. Two transactions might update the same values. There may be secondary indexes that need updating. Or you have page-level locking (like the B-trees in Berkeley DB) and the keys all happen to land on the same page. You even have to write your application code so that it retries transactions in case of deadlocks (IIRC, this is necessary with Berkeley DB even with one writer and multiple readers; see the sketch below).

Concurrent, interleaved execution is particularly interesting when transactions run for some time, but that makes it all the more likely that something triggers a conflict. Berkeley DB does not offer row-level locking for B-trees, and with page-level locks the transaction size is limited by the lock table, so it is particularly prone to aborting transactions once they grow large. And yet Berkeley DB's transactional data store is tremendously complex and difficult to package in a friendly way for end users, with automated recovery and upgrades across major Berkeley DB versions.
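The retry boilerplate then looks roughly like this, assuming a deadlock detector has been configured with DB_ENV->set_lk_detect() and reusing the made-up update_keys() from above (which already aborts its transaction on any error):

    #include <db.h>

    #define MAX_RETRIES 5   /* arbitrary choice */

    /* When two transactions deadlock, Berkeley DB aborts one of them
     * with DB_LOCK_DEADLOCK and leaves the retry to the application. */
    int update_keys_retrying(DB_ENV *env, DB *db, const char **keys, int n)
    {
        int tries, ret = DB_LOCK_DEADLOCK;

        for (tries = 0; tries < MAX_RETRIES; tries++) {
            ret = update_keys(env, db, keys, n);
            if (ret != DB_LOCK_DEADLOCK)
                break;      /* success, or a real error */
            /* The losing transaction was already aborted; try again. */
        }
        return ret;
    }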
This is why I think the Berkeley DB approach is a poor trade-off.
Totally agreed. In practice, concurrent writers in BDB were constantly deadlocking and had to abort and retry.
IMO a lot of D-Bus service machinery and similar cruft could be avoided if different processes could write concurrently to a single embedded DB.
IPC has the advantage that it isolates the database from application crashes.
If you have a robust DB design, application crashes won't have any impact on other users of the DB.
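FWIW, Berkeley DB itself tries to provide that robustness: with the DB_REGISTER flag (available since 4.4, IIRC), DB_ENV->open() can detect that an earlier process died while attached to the environment and, combined with DB_RECOVER, run recovery only in that case. A rough sketch, with a made-up environment path supplied by the caller:

    #include <db.h>
    #include <stddef.h>

    /* Crash-tolerant open: every process opens the environment like
     * this, and recovery runs only if a previous process actually
     * died while attached to it. */
    DB_ENV *open_robust_env(const char *home)
    {
        DB_ENV *env;

        if (db_env_create(&env, 0) != 0)
            return NULL;
        if (env->open(env, home,
                      DB_CREATE | DB_REGISTER | DB_RECOVER |
                      DB_INIT_TXN | DB_INIT_LOCK |
                      DB_INIT_LOG | DB_INIT_MPOOL | DB_THREAD, 0) != 0) {
            env->close(env, 0);
            return NULL;
        }
        return env;
    }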