On 8/12/15 7:29 AM, Andrew Findlay wrote:
On Mon, Aug 10, 2015 at 05:19:22PM -0700, Brian Wright wrote:
> We're trying to solve the problem of how to recover/replace a failed
> node in a system containing a very large number of records and bring
> it back into the cluster as quickly as possible. We're also trying
> to resolve how to ensure that replication works consistently on
> restart.
In terms of recovering a failed node, the very fastest method is to use
a database backup made with mdb_copy. The output from that command is
a file that can be used directly as an MDB database so all you have to do
is put it in place and restart slapd. Even if the backup is a day or two
old, the replication process should bring in the more recent changes
from another server.
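
As a rough illustration of the backup-and-restore flow described above, here is a minimal Python sketch. It assumes mdb_copy is on the PATH and that the database directory holds the usual data.mdb file. The helper names and any paths passed in are hypothetical, not part of OpenLDAP.

```python
import shutil
import subprocess
from pathlib import Path


def backup_command(db_dir, backup_dir, compact=False):
    """Build the mdb_copy command line: mdb_copy [-c] srcpath dstpath."""
    cmd = ["mdb_copy"]
    if compact:
        cmd.append("-c")  # compacting copy: free pages are omitted
    cmd += [str(db_dir), str(backup_dir)]
    return cmd


def take_backup(db_dir, backup_dir):
    """Run mdb_copy against the environment in db_dir.

    mdb_copy expects the destination directory to exist and be empty.
    """
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(backup_command(db_dir, backup_dir), check=True)


def restore_backup(backup_dir, db_dir):
    """Put the backed-up data.mdb in place (assumes slapd is stopped)."""
    src = Path(backup_dir) / "data.mdb"
    dst = Path(db_dir) / "data.mdb"
    Path(db_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return dst
```

After restore_backup() the file ownership would likely need to match the account slapd runs as before slapd is restarted; replication should then bring in the changes made since the backup was taken.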
[...]
There are some caveats with mdb_copy. In particular it can cause
database bloat if run on a server that has a heavy write load at the
time.
Andrew
Thanks for the tip. This really helps us a lot with recovering failed
nodes. I wouldn't have thought to dig into the libraries/liblmdb area
looking for tools. The library yes, but tools no. I had assumed there
must be tools somewhere, but since they didn't get installed with the
regular package, I didn't know where they were. I guess I should always
be more investigative and look through all of the directories of the
source. :)
As for our environment, our LDAP traffic will consist mostly of reads,
with a much smaller number of writes throughout the day. So our
workload should probably not cause much bloat, if any, as long as we're
judicious with the tool usage. I will still make a note of this caveat
when I write the docs for our use of the copy tool.
Thanks again.
--
Brian Wright
Sr. UNIX Systems Engineer
901 Mariners Island Blvd Suite 200
San Mateo, CA 94404 USA
Email: brianw(a)marketo.com
Phone: +1.650.539.3530
http://www.marketo.com/