On 12/14/06, Steven G. Harms stharms@cisco.com wrote:
Replies inline:
On Wed, Dec 13, 2006 at 07:36:22PM -0500, Aaron Richton wrote:
Use of slurpd certainly isn't a modern architecture, especially with concern for rapid DR, for reasons including your question at hand. Look into syncrepl (make sure to upgrade to 2.3.30 first) and mirrormode.
Due to reasons beyond my control, I'm currently at OpenLDAP slapd 2.2.13 (built Aug 18 2005 22:22:34). As such, given your versioning guideline above, syncrepl isn't an option.
I would argue that the way to deal with HA/DR with OpenLDAP is to install enough slaves that you feel the odds of complete failure are tolerable, ....
I have built replicas in different data centers, etc., so I'm going for a 'multiple replica' theory of backup. The DR need not be instant, nor does it need to be perfectly synchronized.
In light of these constraints, would it be appropriate to use slurpd for replication?
Assuming slurpd for replication, what are the answers to these questions:
Can I assign multiple replicas? (A config sketch follows these questions.)
How can I clean out the replog back-queue to a pristine start?
I suppose, more generally, I'm asking: how do I start replication all over? What files can / should I delete, and which should I not touch under any circumstance?
Should I delete the slurpd.replog.lock file and/or the slurpd.replog? Should I just cat /dev/null into them? How can I start over? And further, how do I rotate the replog as it starts to eat up more of the filesystem?
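On the multiple-replica question: with slurpd you can list one replica directive per consumer in the master's slapd.conf, all fed from the same replog. A minimal sketch using 2.2-era syntax, with hypothetical hostnames and credentials:

    # master slapd.conf -- one replog feeds every replica directive
    replogfile      /var/lib/ldap/replog

    replica         host=replica1.example.com:389
                    binddn="cn=replicator,dc=example,dc=com"
                    bindmethod=simple
                    credentials=secret

    replica         host=replica2.example.com:389
                    binddn="cn=replicator,dc=example,dc=com"
                    bindmethod=simple
                    credentials=secret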
For the record I'm using a BDB back end.
While I am also planning a switch to syncrepl, I can only make so many changes at a time, so I'm stuck in this boat with you for a while. I found at the NYC BSD CON that OpenLDAP HA is a very common question for people, so here's a rough outline of how I told people to use slurpd for HA (after saying how much easier syncrepl was, of course).
Run at least two instances of slapd on your master server. This allows you to build new replicas without much downtime for existing replicas or the master. You can start backlogging a new replica from a point-in-time slapcat of another downed replica.
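A sketch of how the two instances might be started (paths and ports are hypothetical; each instance needs its own config file, database directory, pidfile, and argsfile):

    # primary instance on the standard port
    /usr/local/libexec/slapd -f /etc/openldap/slapd.conf \
        -h "ldap://0.0.0.0:389/"

    # second instance on an alternate port, with its own config
    /usr/local/libexec/slapd -f /etc/openldap/slapd-build.conf \
        -h "ldap://0.0.0.0:10389/"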
Use clustering (like Veritas on a SAN; BDB doesn't like NFS, if I remember correctly) to manage failing over the --entire-- database to another physical server if that one craps out.
If you can't use a cluster, or your SAN fails, or whatever, you can also promote one of your replicas to be a master by moving over the config files and, if needed, adding the IP from your previous master. When you fail over the slurpd configs, remove the old master, since you will have to rebuild it from scratch. You can move the old replication logs over and run them in one-shot mode on the new master before you start it up; that should help keep you in sync. You can also manually edit the replog to change the server IP, if needed.
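For the one-shot replay, slurpd has -o (one-shot mode) and -r (replog file) flags; a sketch, assuming the old logs were copied to a hypothetical /var/lib/ldap/old.replog:

    # process the carried-over replog once and exit, rather than
    # running as a daemon watching for new changes
    slurpd -f /etc/openldap/slapd.conf -r /var/lib/ldap/old.replog -o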
Then rebuild your old master from scratch. Shut down and slapcat a replica, using well-timed slurpd config changes to keep everything in sync, and then follow the above procedure for promoting it back to the master.
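That dump-and-reload step might look like this (paths hypothetical; stop slapd on the replica first so the BDB environment is quiescent):

    # on the stopped replica: dump the database to LDIF
    slapcat -f /etc/openldap/slapd.conf -l /tmp/snapshot.ldif

    # on the rebuilt master: load the LDIF before starting slapd
    slapadd -f /etc/openldap/slapd.conf -l /tmp/snapshot.ldif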