LMDB nosync with write order preserving filesystem

18 Aug 2020


The documentation of MDB_NOSYNC says:

    If the filesystem preserves write order and the MDB_WRITEMAP flag
    is not used, transactions exhibit ACI (atomicity, consistency,
    isolation) properties and only lose D (durability).

In practice, which filesystems and mount options preserve write order?
I asked Howard this question elsewhere, and his answer was that ZFS
should do it and ext4 with data=ordered _may_ do it. It seems to me
that ext4 with data=journal should be a very safe bet too, would it
not? Are there any other recommendations?

I ran a few microbenchmarks to compare ext4 with data=ordered and
with data=journal. With the default sync behaviour, they manage about
600 and 400 write txn/s, respectively. With nosync plus an
mdb_env_sync() every second, both reach about 200k write txn/s. For
reference, the system can do about 5 million read txn/s. That makes me
hopeful that ext4 with data=journal could be a good option.
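
The "nosync plus a flush every second" pattern can be sketched roughly as below. This is not the benchmark code, only an illustration of the timing logic. The names periodic_sync, periodic_sync_init, periodic_sync_tick and count_sync are hypothetical helpers, not part of LMDB, and the caller supplies the current time in seconds. In a real program the callback would presumably wrap mdb_env_sync(env, 1); the force argument matters because, with MDB_NOSYNC set, mdb_env_sync() skips the flush unless force is non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of LMDB: decides when a flush is due.
 * The callback returns 0 on success, matching mdb_env_sync()'s convention.
 * With LMDB it could look like:
 *     static int lmdb_sync(void *ctx) { return mdb_env_sync((MDB_env *)ctx, 1); }
 */
typedef int (*sync_fn)(void *ctx);

typedef struct {
    double interval_s;   /* minimum number of seconds between flushes */
    double last_sync_s;  /* time of the last successful flush */
    sync_fn do_sync;     /* performs the actual flush */
    void *ctx;           /* passed through to do_sync */
} periodic_sync;

static void periodic_sync_init(periodic_sync *ps, double interval_s,
                               double now_s, sync_fn do_sync, void *ctx)
{
    assert(ps != NULL && do_sync != NULL && interval_s > 0.0);
    ps->interval_s = interval_s;
    ps->last_sync_s = now_s;
    ps->do_sync = do_sync;
    ps->ctx = ctx;
}

/* Call after each committed write transaction.
 * Returns 1 if a flush ran, 0 if none was due yet, -1 if the flush failed. */
static int periodic_sync_tick(periodic_sync *ps, double now_s)
{
    if (now_s - ps->last_sync_s < ps->interval_s)
        return 0;
    if (ps->do_sync(ps->ctx) != 0)
        return -1;       /* leave last_sync_s unchanged so the next tick retries */
    ps->last_sync_s = now_s;
    return 1;
}

/* Stand-in callback for demonstration: counts flushes instead of touching disk. */
static int count_sync(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}
```

With this arrangement, any transaction committed since the last successful flush can be lost on a crash, so at roughly 200k txn/s and a one-second interval the exposure is on the order of 200k transactions. That is the D in ACID being traded away; whether A, C and I survive still depends on the filesystem preserving write order.
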
Cheers,
Gábor Melis
