Ulrich Windl wrote:
Martin Raiber <martin(a)urbackup.org> wrote on 19.01.2021 at 19:26 in message
<010201771be5823e-1014e8bb-0f79-4506-b27e-d08d25400b1e-000000(a)eu-west-1.amazonse.com>:
On 27.10.2020 18:52 Howard Chu wrote:
- LMDB causes crashes if database is corrupted
You can enable per-page checksums in LMDB 1.0, in which case you'll just get an error code if a page is corrupted (and the checksum fails to match). The DB will still be unusable if anything is corrupted.
That would fix the problem properly. Does it check that it is the correct
transaction as well (e.g. by putting a transid into the page like btrfs)? Returning
wrong results or MDB_CORRUPTED is something my application can handle (but
not crashes obviously).
The txnid is part of the page header, which is one of the incompatible format changes from LMDB 0.9. This is what allows us to eliminate the dirty bit.
Not sure what you mean about the txnid being correct or not, but certainly it is included in the checksum.
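To make the idea of a txnid in the page header that is covered by the checksum concrete, here is a minimal sketch. The header layout, the use of CRC32, and the names `write_page`, `read_page` and `PageCorrupted` are all assumptions for illustration; this is not LMDB's actual on-disk format or API.

```python
import struct
import zlib

PAGE_SIZE = 4096
# Hypothetical header: page number (u64), txnid (u64), CRC32 (u32).
HEADER = struct.Struct("<QQI")


class PageCorrupted(Exception):
    """Raised instead of crashing when a page fails verification."""


def write_page(pgno, txnid, payload):
    """Serialize a page; the checksum covers pgno, txnid and the payload."""
    if len(payload) > PAGE_SIZE - HEADER.size:
        raise ValueError("payload does not fit into one page")
    body = payload.ljust(PAGE_SIZE - HEADER.size, b"\0")
    crc = zlib.crc32(struct.pack("<QQ", pgno, txnid) + body)
    return HEADER.pack(pgno, txnid, crc) + body


def read_page(raw):
    """Parse a page and report corruption as an error, not a crash."""
    pgno, txnid, crc = HEADER.unpack_from(raw)
    body = raw[HEADER.size:]
    if zlib.crc32(struct.pack("<QQ", pgno, txnid) + body) != crc:
        raise PageCorrupted(f"checksum mismatch on page {pgno}")
    return pgno, txnid, body
```

Because the txnid is hashed together with the payload, a bit flip in either one makes `read_page` raise `PageCorrupted`, which the application can handle like any other error code.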
A common problem that e.g. btrfs users encounter is that a disk drops some writes. If there was a page at the same location previously, the checksum check still succeeds. But btrfs stores the transid of the page in the page's parent, so it compares that as well (the error message is "btrfs parent transid verify failed on OFFSET wanted TRANSID found TRANSID"). I think ZFS stores the checksum of the page in the page's parent as well (I don't know if this would work with LMDB's B-tree).
If "a disk drops some writes" it's definitely not a problem of BtrFS. Do you mean "BtrFS drops some writes"? I don't get it.
It was meant as: the disk drops some writes, and btrfs users notice it because btrfs does this checksumming plus transid check (and then they ask online for help with their now broken btrfs, because it doesn't have good repair tools).
See here for a btrfs user testing disks for this problem: https://lore.kernel.org/linux-btrfs/20190624052718.GD11831@hungrycats.org/T/
With respect to LMDB in e.g. LDAP you could say "don't use broken disks then". But for a general-purpose database (with checksumming) it would be nice to have.
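A toy model of the parent-transid idea might look like the following. The dict-based "disk", the page layout and the helper names are made up for illustration and are not btrfs or LMDB code; the point is only that a stale page passes its own checksum but fails the comparison against the transid recorded in its parent.

```python
import zlib


def make_page(transid, data):
    """Build a page whose checksum covers its own transid and data."""
    crc = zlib.crc32(f"{transid}:{data}".encode())
    return {"transid": transid, "data": data, "crc": crc}


def checksum_ok(page):
    """Per-page check: only proves the page is internally consistent."""
    expected = zlib.crc32(f"{page['transid']}:{page['data']}".encode())
    return expected == page["crc"]


def read_child(disk, offset, wanted_transid):
    """Read a child page and compare its transid with the parent's record."""
    page = disk[offset]
    if not checksum_ok(page):
        raise IOError(f"checksum mismatch at {offset}")
    if page["transid"] != wanted_transid:
        raise IOError(
            f"parent transid verify failed on {offset} "
            f"wanted {wanted_transid} found {page['transid']}"
        )
    return page


# Transaction 10 wrote this page. Transaction 11 rewrote it, but the disk
# silently dropped that write, so the old page is still at the same offset.
disk = {4096: make_page(10, "old contents")}
# The parent page was written successfully and expects transid 11.
parent_pointer = {"offset": 4096, "transid": 11}
```

Here `checksum_ok(disk[4096])` is true, while `read_child(disk, parent_pointer["offset"], parent_pointer["transid"])` raises, which is the case a per-page checksum alone cannot detect.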
I guess a simple check (that might already exist) is checking whether page transid <= root/meta page transid. But that doesn't catch the cases where the root page was updated but updates to other pages were dropped. (The disk might also drop a complete "transaction" without reporting an error, in which case the next transaction writes a root page pointing to an incompletely written tree.)
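For comparison, the simple bound check can be sketched in a few lines. The function name and the example transaction numbers are hypothetical; the sketch only illustrates why the bound catches pages "from the future" but not stale ones.

```python
def passes_bound_check(page_txnid, meta_txnid):
    """Accept a page if it is not newer than the root/meta page."""
    return page_txnid <= meta_txnid


meta_txnid = 11        # root/meta page of transaction 11 reached the disk
stale_page_txnid = 10  # child page: the write from transaction 11 was dropped
future_page_txnid = 12 # a page that claims to be newer than the meta page

stale_accepted = passes_bound_check(stale_page_txnid, meta_txnid)
future_accepted = passes_bound_check(future_page_txnid, meta_txnid)
```

The stale page is accepted because 10 <= 11 holds, so the dropped write goes unnoticed; only the page claiming a txnid above the meta page's is rejected.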