Re: Antw: [EXT] Re: Use-case specific changes to LMDB

22 Jan 2021

      Ulrich Windl wrote:
Martin Raiber <martin(a)urbackup.org> wrote on
19.01.2021 at 19:26 in message
<010201771be5823e-1014e8bb-0f79-4506-b27e-d08d25400b1e-000000(a)eu-west-1.amazonse
.com>:
On 27.10.2020 18:52 Howard Chu wrote:

LMDB causes crashes if database is corrupted

You can enable per-page checksums in LMDB 1.0, in which case you'll just
get an error code if a page is corrupted (and the checksum fails to match).
The DB will still be unusable if anything is corrupted.
That would fix the problem properly. Does it check that it is the correct
transaction as well (e.g. by putting a transid into the page like btrfs)?
Returning wrong results or MDB_CORRUPTED is something my application can
handle (but not crashes, obviously).
The txnid is part of the page header, which is one of the incompatible
format changes from LMDB 0.9. This is what allows us to eliminate the
dirty bit.
Not sure what you mean about the txnid being correct or not, but certainly
it is included in the checksum.
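
As a rough illustration of the idea above (a checksum that covers the page header including the txnid, and a reader that returns an error code instead of crashing), here is a minimal sketch. The page layout, the CRC32 choice, and the names `build_page`, `read_page`, `OK` and `ERR_CORRUPTED` are assumptions made up for this example; they are not LMDB's actual on-disk format or API.

```python
import struct
import zlib

PAGE_SIZE = 4096
HEADER = struct.Struct("<QQ")   # hypothetical header: page number, txnid
TRAILER = struct.Struct("<I")   # hypothetical trailer: CRC32 over header + payload

OK, ERR_CORRUPTED = 0, -1       # illustrative codes, not LMDB's constants


def build_page(pgno, txnid, payload):
    """Serialize a page; the checksum covers the header, so it covers the txnid."""
    body_len = PAGE_SIZE - HEADER.size - TRAILER.size
    if len(payload) > body_len:
        raise ValueError("payload too large for one page")
    body = HEADER.pack(pgno, txnid) + payload.ljust(body_len, b"\0")
    return body + TRAILER.pack(zlib.crc32(body))


def read_page(raw):
    """Return (code, pgno, txnid); damaged input yields an error code, not an exception."""
    if len(raw) != PAGE_SIZE:
        return ERR_CORRUPTED, None, None
    body, trailer = raw[:-TRAILER.size], raw[-TRAILER.size:]
    (stored,) = TRAILER.unpack(trailer)
    if zlib.crc32(body) != stored:
        return ERR_CORRUPTED, None, None
    pgno, txnid = HEADER.unpack_from(body)
    return OK, pgno, txnid
```

Because the txnid sits inside the checksummed region, flipping a bit in it is reported the same way as damage to the payload.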
A common problem that e.g. btrfs users encounter is that a disk drops
some writes. If there was a page at the same location previously, the
checksum check succeeds. But btrfs stores the transid of the page in the
page's parent, so it compares that as well (the error message is "btrfs
parent transid verify failed on OFFSET wanted TRANSID found TRANSID"). I
think ZFS stores the checksum of the page in the page's parent as well
(I don't know if this would work with LMDB's B-tree).
I guess a simple check (that might already exist) is checking if page
transid <= root/meta page transid. But that doesn't catch the cases where
the root page was updated, but updates to other pages were dropped (the
disk might also drop a complete "transaction" but not report an error,
in which case the next transaction then writes the root page pointing to
an incompletely written tree).
If "a disk drops some writes" it's definitely not a problem of BtrFS. Do
you mean "BtrFS drops some writes"?
I don't get it.
It was meant as: the disk drops some writes, and btrfs users notice it because btrfs does this checksumming plus transid check (and then ask online for help with their now-broken btrfs, because it doesn't have good repair tools).
See here for a btrfs user testing disks for this problem: https://lore.kernel.org/linux-btrfs/20190624052718.GD11831@hungrycats.org/T/
With respect to LMDB in e.g. LDAP you could say "don't use broken disks then". But for a general-purpose database (with checksumming) it would be nice to have.
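
To make the dropped-write scenario above concrete, here is a small simulation. It is only a sketch: `LossyDisk`, `Page`, `verify_child` and `demo` are hypothetical names, CRC32 stands in for whatever checksum a real system uses, and rewriting the same offset is a simplification of a copy-on-write tree reusing a previously freed page. It shows that a stale page still passes its own checksum and the "page transid <= meta transid" check, while a transid recorded in the parent catches it.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    txnid: int
    payload: bytes
    checksum: int


def _crc(txnid, payload):
    return zlib.crc32(txnid.to_bytes(8, "little") + payload)


def make_page(txnid, payload):
    return Page(txnid, payload, _crc(txnid, payload))


class LossyDisk:
    """Toy disk that can silently drop one write while reporting success."""

    def __init__(self):
        self.slots = {}
        self.drop_next = False

    def write(self, offset, page):
        if self.drop_next:
            self.drop_next = False
            return  # nothing stored, no error reported
        self.slots[offset] = page

    def read(self, offset):
        return self.slots[offset]


def verify_child(disk, offset, wanted_txnid, meta_txnid):
    """Run the three checks discussed in the thread against one child page."""
    page = disk.read(offset)
    return {
        "checksum": page.checksum == _crc(page.txnid, page.payload),
        "le_meta": page.txnid <= meta_txnid,
        "parent_transid": page.txnid == wanted_txnid,
    }


def demo(drop=True):
    disk = LossyDisk()
    disk.write(0, make_page(1, b"old"))   # transaction 1
    disk.drop_next = drop
    disk.write(0, make_page(2, b"new"))   # transaction 2, possibly dropped
    # The parent and the meta page were both updated by transaction 2.
    return verify_child(disk, 0, wanted_txnid=2, meta_txnid=2)
```

With `demo()` the stale page from transaction 1 passes the `checksum` and `le_meta` checks and fails only `parent_transid`; with `demo(drop=False)` all three checks pass.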
