Re: Upgrade to 2.3.40 -> failed index

4 Feb 2008


On Mon, 4 Feb 2008, Howard Chu wrote:
...
> That documentation is clearly obsolete, which is why it was removed.
slurpd is obsolete, which is why the section on slurpd was removed from the
2.4 manual. Considering OpenLDAP-2.3.39 is still marked as the stable
release, I can't really see that the 2.3 documentation in its entirety is
obsolete.
...
> http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/a...
Ah, that is the section on backing up/restoring a database, which I suppose
could also be considered the same procedure to be used for copying a
database from one system to another. Given your original wording, I was
looking for something more specifically geared towards copying.
...
> At a guess, you failed to copy the transaction log files to the slaves.
If I had failed to copy the transaction log files, I don't really see that
it would have worked at all; let alone for almost a year.
Reviewing the backup/restore procedure, I don't really see anything I might
have missed. slapd was not running during the copy, so clearly any updates
were suspended. In fact, slapd had never been run -- the copy was made
immediately after the initial slapadd. There were actually no log files
present. As I mentioned, I have bdb configured to automatically remove
them. Presumably slapadd explicitly/implicitly checkpointed upon
completion and they were removed. Even if there was a log file that I
didn't see, the log files were stored in the same directory as the database
files, and I copied the entire directory.
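For reference, the offline copy described above can be sketched roughly as follows. This is only an illustration, not the procedure from the Berkeley DB documentation: the function names and the directory arguments are hypothetical, and it assumes slapd is stopped, that the logs live in the same directory as the database files, and that they use the usual Berkeley DB naming (log. followed by ten digits). It copies every file in the directory and reports which ones look like transaction logs, so it is obvious whether any were present at copy time.

```python
import re
import shutil
from pathlib import Path

# Berkeley DB names its transaction logs log.NNNNNNNNNN (ten digits).
LOG_RE = re.compile(r"^log\.\d{10}$")


def classify(db_dir):
    """Group the regular files in a database directory by apparent role."""
    groups = {"data": [], "logs": [], "regions": [], "other": []}
    for path in sorted(Path(db_dir).iterdir()):
        if not path.is_file():
            continue
        name = path.name
        if LOG_RE.match(name):
            groups["logs"].append(name)      # transaction log files
        elif name.startswith("__db."):
            groups["regions"].append(name)   # environment region files
        elif name.endswith(".bdb"):
            groups["data"].append(name)      # back-bdb database files
        else:
            groups["other"].append(name)     # e.g. DB_CONFIG
    return groups


def copy_offline(src, dst):
    """Copy the whole directory (hypothetical helper); slapd must be stopped.

    Returns the classification of the source files so the caller can see
    whether any transaction logs existed when the copy was made.
    """
    groups = classify(src)
    Path(dst).mkdir(parents=True, exist_ok=True)
    for names in groups.values():
        for name in names:
            shutil.copy2(Path(src) / name, Path(dst) / name)
    return groups
```

An empty "logs" list in the returned dictionary would match the situation described here: no log files existed at the time of the copy, so there were none to miss.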
...
...
>> Also, even if for some reason the copies on the two slaves were invalid,
>> that would not explain why the master failed. The database on the master
>> was the original database built by slapadd when the server was first put
>> into commission. How could making a copy of it have caused it to fail
>> itself?
> Too difficult to guess, given the lack of information. We have only your
> assurance that nothing was done incorrectly, but the facts indicate that at
> least one step was done incorrectly.
The facts only indicate that I had a catastrophic failure. That the failure
was caused by incompetence is only a hypothesis.
I do greatly appreciate your response and willingness to help; I apologize
if I'm getting a bit defensive.
You do have only my assurance that I didn't screw something up. However,
assuming I'm not lying, the facts are:
* openldap 2.3.35 was initially installed on three servers
* on the master server, slapadd was run to load in an existing database
  in ldif format
* the resultant bdb database was then copied to both slaves
* all three were put into production March 2007 and ran perfectly
  under a reasonably heavy load
* a week or so ago I upgraded them to 2.3.40 (stop old server, install
  new server, start new server -- never touching bdb or the existing
  database files)
* they ran fine for at least 3-4 days
* this weekend, they died horribly
Given these facts, if something was done incorrectly, it does not seem
likely that it was failure to copy a transaction log file in March 2007. If
the failure was my own doing, it seems more likely a byproduct of the
upgrade, although I can't think of anything that I could have done wrong
during that process.
At this point, I guess I'll just write it off and hope it doesn't happen
again.
-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
