On Mon, Aug 10, 2015 at 05:19:22PM -0700, Brian Wright wrote:
> We're trying to solve the problem of how to recover/replace a failed node
> in a system containing a very large number of records and bring it back
> into the cluster as quickly as possible. We're also trying to resolve how
> to ensure that replication works consistently on restart.

In terms of recovering a failed node, the very fastest method is to use a database backup made with mdb_copy. The output from that command is a file that can be used directly as an MDB database so all you have to do is put it in place and restart slapd. Even if the backup is a day or two old, the replication process should bring in the more recent changes from another server.

[...]

There are some caveats with mdb_copy. In particular it can cause database bloat if run on a server that has a heavy write load at the time.

Andrew
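
For illustration, the backup-and-restore steps described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the helper names `backup_command` and `restore_backup` and all paths are hypothetical, the database is assumed to use the default LMDB directory layout (a `data.mdb` file inside the database directory), and slapd is assumed to be stopped before the restore and restarted afterwards by whatever means the site uses.

```python
import os
import shutil
from pathlib import Path


def backup_command(db_dir, backup_dir, compact=False):
    """Build the argv for an mdb_copy run (hypothetical helper).

    backup_dir is assumed to be an existing, empty directory;
    mdb_copy writes data.mdb into it.
    """
    cmd = ["mdb_copy"]
    if compact:
        cmd.append("-c")  # compacting copy: free pages are left out
    cmd += [str(db_dir), str(backup_dir)]
    return cmd


def restore_backup(backup_dir, db_dir):
    """Put a backed-up data.mdb in place (hypothetical helper).

    Assumes slapd is NOT running while this executes.
    """
    src = Path(backup_dir) / "data.mdb"
    if not src.is_file():
        raise FileNotFoundError(src)
    db_dir = Path(db_dir)
    db_dir.mkdir(parents=True, exist_ok=True)
    dst = db_dir / "data.mdb"
    tmp = db_dir / "data.mdb.tmp"
    shutil.copy2(src, tmp)   # copy next to the target first
    os.replace(tmp, dst)     # then swap it in with a single rename
    return dst


if __name__ == "__main__":
    # Example paths only; adjust to the local installation.
    print(" ".join(backup_command("/var/lib/ldap", "/backup/ldap")))
```

The backup command would be run on a healthy server (for example via `subprocess.run(cmd, check=True)`), and `restore_backup` on the replacement node before slapd is started, after which replication brings in the changes made since the backup.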