Hi,
I've been going through all the documentation I can find (the FAQ, the BDB docs, ...) and I still have some doubt about whether I understand this correctly.
I run frequent dumps with slapcat to back up the database, but I still need to clean up the BDB log files, and it would also be nice to get back online after a crash faster than you can from LDIF.
So I understand how to create a hot backup by copying the database files (db_archive -s) and then the log files (db_archive -l) and running db_recover -c.
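Concretely, I mean something along these lines (paths are just examples from my setup; db_archive -a makes it print absolute pathnames):

  # copy the database files, then the log files, to the backup area
  cp `db_archive -a -s -h /var/lib/ldap` /backup/ldap/
  cp `db_archive -a -l -h /var/lib/ldap` /backup/ldap/
  # bring the copied environment to a consistent state
  db_recover -c -h /backup/ldap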
I can see that I can delete unused log files (db_archive [no options]) from the backup. But when is it safe to remove log files from the active environment? db_archive on the active environment lists fewer files than on the backup (predictably enough).
The docs say that running db_archive -d can make recovery impossible. OK... so I don't do that. But what is required of my hot backup snapshot for me to know that I can delete log files from the active environment (and which ones?) without affecting the possibility of recovery?
Could anyone list a step-by-step procedure to create a snapshot for backup and prune the log files from the active environment?
btw: openldap 2.3.30/ bdb 4.2.52 (debian) ... but I guess that's not so important here.
regards, Peter
PS: I expect still to do occasional slapcats just as an extra security measure.
Peter Mogensen wrote:
I can see that I can delete unused log files (db_archive [no options]) from the backup. But when is it safe to remove log files from the active environment?
Let auto-archive do that for you.
Gavin Henry wrote:
I can see that I can delete unused log files (db_archive [no options]) from the backup. But when is it safe to remove log files from the active environment?
Let auto-archive do that for you.
Ok... then I guess I'm confused by the repeated warnings like:
"To have them removed automatically, place set_flags DB_LOG_AUTOREMOVE directive in DB_CONFIG. Note that if the log files are removed automatically, recovery after a catastrophic failure is likely to be impossible."
and:
"Automatic log file removal is likely to make catastrophic recovery impossible."
How do I remove log files without making catastrophic recovery impossible?
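For reference, the directive those warnings are talking about is just one line in DB_CONFIG:

  set_flags DB_LOG_AUTOREMOVE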
regards, Peter
--On Thursday, April 10, 2008 3:26 PM +0200 Peter Mogensen apm@mutex.dk wrote:
How do I remove log files without making catastrophic recovery impossible?
You can't. That's why you slapcat periodically, so you have an alternative. Not that I personally have ever hit a catastrophic event in the years I've been using OpenLDAP where the log files would have helped in any way. I.e., disk failure? -- logs are useless if they were on the disk that failed. And in that case, I just slapcat a different replica and slapadd it onto the server with the failed disk once it has been replaced. In short, I find having replicas a much more effective disaster recovery mechanism than log files, which are susceptible to other problems.
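In practice that's roughly (suffix illustrative):

  # on a surviving replica:
  slapcat -b "dc=example,dc=com" -l replica.ldif
  # on the rebuilt server, with slapd stopped:
  slapadd -b "dc=example,dc=com" -l replica.ldif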
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
--On Thursday, April 10, 2008 3:26 PM +0200 Peter Mogensen apm@mutex.dk wrote:
How do I remove log files without making catastrophic recovery impossible?
You can't. ... In short, I find having replicas a much more effective disaster recovery mechanism than log files, which are susceptible to other problems.
Ok... I get that. Replicas are good for recovery.
However, I would still expect there to be some way to retire very old log files: create a backup snapshot representing the state at some point in time, and delete all log files describing transactions included in that state.
...and then later do recovery with that snapshot and the newer log files from the active environment.
Are you saying that to be safe you have to keep the log.000000001 _for ever_ ?
If so, I think the FAQ needs an update.
regards, Peter
On Fri, 11 Apr 2008, Peter Mogensen wrote:
Are you saying that to be safe you have to keep the log.000000001 _for ever_ ?
If you want to guarantee that db_recover -c will function, then this is pretty close to how it works out in practice. See
http://www.oracle.com/technology/documentation/berkeley-db/db/gsg_txn/C/logf...
in particular, the requirement to keep allegedly "removed" log files in an offline backup. So you don't have to keep it in your live environment, but you have to keep it *somewhere*.
slapcat(8) has the advantage of making snapshots in a single-file, text-only format, without any database issues like transaction logs to worry about. Of course, a well-oiled, homogeneous db_hotbackup installation *may* be able to start up faster than a slapadd in the event of a backup restore. Your Environment May Vary.
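(Assuming your BDB release ships it, db_hotbackup rolls the copy-and-recover steps into a single command, along the lines of:

  db_hotbackup -h /var/lib/ldap -b /backup/ldap

with -h naming the live environment and -b the backup directory; paths illustrative.)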
Aaron Richton writes:
On Fri, 11 Apr 2008, Peter Mogensen wrote:
Are you saying that to be safe you have to keep the log.000000001 _for ever_ ?
If you want to guarantee that db_recover -c will function, then this is pretty close to how it works out in practice. See
http://www.oracle.com/technology/documentation/berkeley-db/db/gsg_txn/C/logf...
I'm not sure how to reconcile that one with the advice in "Database and log file archival" (http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/a...), which says you need only archive the last log file.
You need to stop slapd first, though. Or sync to another slapd, then stop and back up that slapd's database.
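For example (init script and paths illustrative):

  /etc/init.d/slapd stop
  cp /var/lib/ldap/*.bdb /backup/ldap/
  # with slapd cleanly stopped, only the most recent log file is needed
  cp `db_archive -a -l -h /var/lib/ldap | tail -1` /backup/ldap/
  /etc/init.d/slapd start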
Peter Mogensen wrote:
Hi,
I've been going through all the documentation I can find (the FAQ, the BDB docs, ...) and I still have some doubt about whether I understand this correctly.
All of your questions are answered in the official BDB docs.
http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/a...
These docs are for the current version, 4.6.21, but not much has changed in the description since 4.2.52.
I run frequent dumps with slapcat to back up the database, but I still need to clean up the BDB log files, and it would also be nice to get back online after a crash faster than you can from LDIF.
So I understand how to create a hot backup by copying the database files (db_archive -s) and then the log files (db_archive -l) and running db_recover -c.
No, "db_recover -c" is for recovering from a catastrophic failure. It's not for creating a backup.
I can see that I can delete unused log files (db_archive [no options]) from the backup. But when is it safe to remove log files from the active environment? db_archive on the active environment lists fewer files than on the backup (predictably enough).
From the BDB doc page above:
>> To minimize the archival space needed for log files when doing a hot backup, run db_archive to identify those log files which are not in use. Log files which are not in use do not need to be included when creating a hot backup, and you can discard them or move them aside for use with previous backups (whichever is appropriate), before beginning the hot backup. <<
The docs say that running db_archive -d can make recovery impossible. OK... so I don't do that. But what is required of my hot backup snapshot for me to know that I can delete log files from the active environment (and which ones?) without affecting the possibility of recovery?
Could anyone list a step-by-step procedure to create a snapshot for backup and prune the log files from the active environment?
The docs/ref/transapp/archival.html file that Sleepycat bundles with the BerkeleyDB installation provides all the steps.
Howard Chu wrote:
All of your questions are answered in the official BDB docs.
http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/a...
Well... yes... I suppose so, but as I said, they left me in doubt.
No, "db_recover -c" is for recovering from a catastrophic failure. It's not for creating a backup.
When reading the docs, it seems to me that db_recover -c is an integral part of making a hot backup?
From the BDB doc page above:
>> To minimize the archival space needed for log files when doing a hot backup, run db_archive to identify those log files which are not in use. Log files which are not in use do not need to be included when creating a hot backup, and you can discard them or move them aside for use with previous backups (whichever is appropriate), before beginning the hot backup. <<
Yes... but I thought that statement was contradicted by other statements, like the ones saying that deleting unused log files will make recovery impossible. What happens if your environment should crash after you have discarded these log files, but before you begin your hot backup?
Could anyone list a step-by-step procedure to create a snapshot for backup and prune the log files from the active environment?
The docs/ref/transapp/archival.html file that Sleepycat bundles with the BerkeleyDB installation provides all the steps.
I've read that file several times, and now again. Let me try to describe what I think it says:
To perform a backup and prune unused log files from your active environment:
============= WARNING: Only my guess =============
1) Run "db_archive" on your active environment to identify unused log files. Copy them somewhere to keep while doing the backup.
2) Run "db_archive -s" to identify database files and copy them to your backup location.
3) Run "db_archive -l" on your active environment to identify all log files and copy them to your backup location.
4) Run "db_recover -c" on your backup to make it consistent.
5) Since the backup is offline, you can safely delete the unused log files from it ("db_archive -d").
6) The log files copied in step 1 can now safely be discarded so they don't exist anywhere - including the active environment.
Then it's my impression that if the active environment should crash, you should be able to continue from the backup plus the log files from the active environment, with minimal data loss?
============== End guess ==============
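In shell terms, I imagine that means something like this (paths invented, and still only my guess):

  # 1) stash the currently unused log files aside
  mkdir /tmp/old-logs
  cp `db_archive -a -h /var/lib/ldap` /tmp/old-logs/
  # 2) copy the database files to the backup location
  cp `db_archive -a -s -h /var/lib/ldap` /backup/ldap/
  # 3) copy all log files to the backup location
  cp `db_archive -a -l -h /var/lib/ldap` /backup/ldap/
  # 4) make the backup consistent
  db_recover -c -h /backup/ldap
  # 5) the backup is offline, so its unused log files can go
  db_archive -d -h /backup/ldap
  # 6) discard the stashed logs, and remove the same files
  #    from the active environment's log directory
  rm -r /tmp/old-logs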
I might have misunderstood a lot, but I've tried to take into account all the hints and fragments of advice I've found in the docs, to arrive at a way to get rid (for good) of very old log files, keeping only backups of the database files plus a few relevant logs ... if this is at all possible?
regards, Peter
On Fri, 11 Apr 2008, Peter Mogensen wrote:
Howard Chu wrote:
...
No, "db_recover -c" is for recovering from a catastrophic failure. It's not for creating a backup.
When reading the docs, it seems to me that db_recover -c is an integral part of making a hot backup?
"db_recover -c" says "perform recovery using all of the txn log files that are present instead of only going back to the point named in the last checkpoint". When making a hot backup, you need to do that in case a checkpoint was taken between when you started the copy of the first database file and when the copy of the last txn log completed. That "catastrophic recovery" only needs to be performed on the txn log files that were copied as part of the hot backup and not txn log files that were archivable before the first database file was copied.
In theory, it would be possible to perform full catastrophic recovery of a database from *just* the txn log files starting at log.000000001 and _no_ database files...but that will probably take more time than you really are interested in spending. The whole point of backing up the database files is to make it unnecessary to save and process the txn log files whose contents have been completely checkpointed to the database files.
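To make the distinction concrete (home directory illustrative):

  # normal recovery: roll forward from the last checkpoint
  db_recover -h /backup/ldap
  # catastrophic recovery: replay every txn log file present
  db_recover -c -h /backup/ldap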
...
What happens if your environment should crash after you have discarded these log files, but before you begin your hot backup?
Their contents have been checkpointed to the database files, so normal recovery is sufficient.
...
To perform a backup and prune unused log files from your active environment: ============= WARNING: Only my guess =============
1) Run "db_archive" on your active environment to identify unused log files. Copy them somewhere to keep while doing the backup.
These files are not needed in the backup itself. Indeed, they're only needed if any of the database files are lost or corrupted without also losing the txn log files. In my experience, the situations where these files are useful are better handled by recovering from a replica instead of trying to perform database level recovery.
(I once helped a site where a backplane failure managed to make fsync() lie, such that a checkpoint completed without the data actually making it to disk for the database files. The txn log files were fine, so performing catastrophic recovery with the not-yet-archived txn logs was sufficient to fix the problem. But that's the *only* time, in 7 years of intensive commercial BDB usage, that I've seen a use for archivable txn log files.)
- Run "db_archive -s" to indentify database files and copy them to your
backup location. 3) Run "db_archive -l" on you active environment to indentify all log files and copy them to your backup location.
Do be sure to follow the BDB documentation regarding copying of the files. In particular, use dd instead of cp on Solaris (or write your own program that uses read() and not mmap()).
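For instance (file name illustrative; make bs a multiple of your database page size):

  dd if=/var/lib/ldap/id2entry.bdb of=/backup/ldap/id2entry.bdb bs=4096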
- Run "db_recover -c" on your backup to make it consistent.
- Since the backup is offline you can safely delete the unused log files
from it. ("db_archive -d") 6) The log files copied in step 1) can now safely be discarded so they don't exist anywhere - including the active environment.
Then it's my impression that in case the active environment should crash you should be able to continue from the backup + the logfiles from the active environment with minimal data loss ??? ============== End guess =========
Other than my comments above, this procedure looks good to me. *Do* be sure to test it, both by forcing failures in various parts of it (out of disk space during a copy?) and by actually making sure you have a tested procedure for restoring a backup.
Philip Guenther