mdb database protected against any power cut? - openldap-technical - openldap.org


mdb database protected against any power cut?


Francois Gnu

24 Aug 2012

4:54 p.m.

Hello @ll!

Is the mdb database protected against any power cut?

Does the mdb database's recovery mechanism work properly?

Thank you very much

Librement, ------ Francois Trachez (kiko) Team Fedora|Lyon (France) http://stg.fedoraproject.org/fr/ http://stg.fedoraproject.org/es/


Hallvard Breien Furuseth

24 Aug 2012

6:08 p.m.

Francois Gnu writes:

Is the mdb database protected against any power cut?

Yes, if the filesystem and operating system are so protected and you don't use mdb's dbnosync option.

Does the mdb database's recovery mechanism work properly?

MDB needs no recovery. You may lose the last commit or two, but the database will be consistent if the filesystem is correct.

However, note that your OS and harddisk may both be lying when they claim they've really written your data. MDB does not attempt to deal with data which got broken that way. Some databases do try.

The OS part should be tunable with OS/filesystem parameters. Basically, fsync() or fdatasync() needs to actually write to disk. Hard disk writes can be secured with built-in batteries that keep the disk going long enough to save the cache, with RAID/SAN setups, and I don't know what else. (I'm way out of date with this stuff.)

Explanation:

A database commit needs to be flushed to disk, and the data involved needs to be written in the right order (data before metadata). That requires physical seeks on the disk, and doing a lot of those can slow down not just the program doing it but the entire system, since other disk I/O may need to wait for these seeks too.

To speed things up, normal file operations just write to caches in the OS and can return success before the cache has been written to disk. MDB calls fdatasync() to flush this to disk before returning success.

But so do plenty of other programs, which can slow the system right back down. So to speed things up again, fdatasync() may ALSO be lying. And probably your harddisk too. It caches written data and returns success which actually just means "Got it, this'll get written".
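The ordering and flush described above can be sketched roughly as follows. This is only an illustration in Python, not MDB's actual code: the two-file layout, the `commit` and `read_committed` helpers and the 8-byte offset used as "metadata" are all assumptions made up for the example. It also assumes a POSIX-like system and, as noted above, it cannot help if the OS or the disk lies about the flush.

```python
import os
import struct

def sync_fd(fd):
    # Use fdatasync() where the platform has it, otherwise fall back to fsync().
    getattr(os, "fdatasync", os.fsync)(fd)

def commit(data_path, meta_path, payload):
    """Append payload, flush it, and only then publish the new end offset."""
    # Step 1: write the data and ask the OS to push it out of its cache.
    fd = os.open(data_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        sync_fd(fd)
        end = os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
    # Step 2: the metadata (hypothetical: one 64-bit offset) goes out last,
    # so it never refers to data that has not been flushed yet.
    fd = os.open(meta_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, struct.pack("<Q", end))
        sync_fd(fd)
    finally:
        os.close(fd)
    return end

def read_committed(data_path, meta_path):
    """Return only the bytes that the metadata says were committed."""
    with open(meta_path, "rb") as f:
        (end,) = struct.unpack("<Q", f.read(8))
    with open(data_path, "rb") as f:
        return f.read(end)
```

If the machine dies between step 1 and step 2, a reader still sees the old offset and simply ignores the extra bytes, which is the sense in which no recovery pass is needed in this toy layout.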

-- Hallvard


Hallvard Breien Furuseth

6:22 p.m.

I wrote:

Francois Gnu writes:

...
Does the mdb database's recovery mechanism work properly?

MDB needs no recovery. You may lose the last commit or two, but the database will be consistent if the filesystem is correct.

Whoops - to clarify: You may lose the _last_ one or two commits, until they get flushed to disk. The one in progress, and maybe the last successful commit. Not just any random commit:-)

You can set up checkpointing to limit how long the last successful commit can be endangered. ...unless I read the code wrong. The manpage says "checkpoint" is only needed with "dbnosync". Hmm.

-- Hallvard


Francois Gnu

7:03 p.m.

Thank you very much for these well detailed answers.

Librement, ------ Francois Trachez (kiko) Team Fedora|Lyon (France) http://stg.fedoraproject.org/fr/ http://stg.fedoraproject.org/es/


Howard Chu

8:09 p.m.

Hallvard Breien Furuseth wrote:

I wrote:

...
Francois Gnu writes:

...
Does the mdb database's recovery mechanism work properly?

MDB needs no recovery. You may lose the last commit or two, but the database will be consistent if the filesystem is correct.

Whoops - to clarify: You may lose the _last_ one or two commits, until they get flushed to disk. The one in progress, and maybe the last successful commit. Not just any random commit:-)

You can set up checkpointing to limit how long the last successful commit can be endangered. ...unless I read the code wrong. The manpage says "checkpoint" is only needed with "dbnosync". Hmm.

In the default mode (fully synchronous) you can only lose the in-progress transaction. With dbnosync you can lose whatever hasn't been checkpointed yet. There is another mode (recently added, not yet exposed in back-mdb) that only flushes the data, not the metadata. With that mode, you might lose the in-progress transaction and the immediately preceding commit. This mode is slightly faster than fully-synchronous, and slightly slower than fully asynch.
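To make the three cases concrete, here is a toy bookkeeping model in Python. It is not back-mdb or MDB code: the class `ToyStore` and the mode names `"full"`, `"data_only"` and `"nosync"` are hypothetical labels for the modes described above, and the model only counts commits instead of touching a disk. The in-progress transaction is never acknowledged, so it is left out of the count in every mode.

```python
class ToyStore:
    """Counts acknowledged commits versus commits known to be durable."""

    def __init__(self, mode):
        if mode not in ("full", "data_only", "nosync"):
            raise ValueError("unknown mode: " + mode)
        self.mode = mode
        self.acked = 0    # last commit the application saw succeed
        self.durable = 0  # last commit that would survive a power cut

    def commit(self):
        self.acked += 1
        if self.mode == "full":
            # Data and metadata are both flushed before success is returned.
            self.durable = self.acked
        elif self.mode == "data_only":
            # Only the data is flushed; the model assumes the metadata of
            # the previous commit is safe, but not that of this one.
            self.durable = self.acked - 1
        # "nosync": nothing is durable until checkpoint() is called.

    def checkpoint(self):
        self.durable = self.acked

    def worst_case_lost(self):
        """Acknowledged commits a power cut right now could still lose."""
        return self.acked - self.durable


for mode in ("full", "data_only", "nosync"):
    store = ToyStore(mode)
    for _ in range(3):
        store.commit()
    print(mode, store.worst_case_lost())
```

After three commits without a checkpoint this prints 0 for `full`, 1 for `data_only` and 3 for `nosync`, matching the description above.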

-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/


Hallvard Breien Furuseth

25 Aug 2012

1:28 p.m.

Howard Chu writes:

In the default mode (fully synchronous) you can only lose the in-progress transaction.

Oh duh, I forgot the sync filehandle.

With dbnosync you can lose whatever hasn't been checkpointed yet.

I thought dbnosync could cause database corruption: Data and metadata writes can get flushed out of order to the disk, leaving the meta page temporarily pointing at garbage - at which time an OS crash or hardware failure can make the inconsistency permanent. Frequent checkpoints merely reduce the chance such a crash has of catching the DB in an inconsistent state. (OTOH a mere slapd crash is no problem.)
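The concern about out-of-order flushing can be illustrated with a small enumeration in Python. This is a thought experiment, not a statement about what MDB or any particular OS does: the page labels `"data:7"` and `"meta->7"` and the helper names are made up for the example. It lists every on-disk state a crash could leave behind when two cached writes may be flushed in any order, and compares that with the case where the order is enforced.

```python
from itertools import permutations

def crash_states_unordered(pending):
    """States reachable if the cache flushes in any order and the
    machine can die after any number of those writes."""
    for order in permutations(pending):
        for n in range(len(order) + 1):
            yield frozenset(order[:n])

def crash_states_ordered(pending):
    """States reachable if writes reach the disk strictly in list order."""
    for n in range(len(pending) + 1):
        yield frozenset(pending[:n])

def is_consistent(on_disk):
    # A new meta page is only valid if the data page it points at is
    # also on disk; without it, the old meta page is still in effect.
    return "meta->7" not in on_disk or "data:7" in on_disk

pending = ["data:7", "meta->7"]  # data first, then the meta page

bad_unordered = [s for s in set(crash_states_unordered(pending))
                 if not is_consistent(s)]
bad_ordered = [s for s in set(crash_states_ordered(pending))
               if not is_consistent(s)]

print("unordered:", bad_unordered)
print("ordered:  ", bad_ordered)
```

Under these assumptions the unordered case has exactly one bad state, the meta page on disk without its data page, while the ordered case has none. A plain process crash does not produce the bad state either, because the OS cache still holds both writes.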

There is another mode (recently added, not yet exposed in back-mdb) that only flushes the data, not the metadata. With that mode, you might lose the in-progress transaction and the immediately preceding commit. This mode is slightly faster than fully-synchronous, and slightly slower than fully asynch.

That's like if we compiled with -DMDB_DSYNC=0?

-- Hallvard

