--On Wednesday, March 18, 2009 3:01 AM -0700 Howard Chu <hyc(a)symas.com>
wrote:
> Quanah Gibson-Mount wrote:
>> A number of our clients have requested "fail-over"/redundancy
>> capabilities for the LDAP master, and as I'm currently working on moving
>> our product to use OpenLDAP 2.4, this becomes a distinct possibility.
>> However, I have some questions about the
>> viability/reliability/effectiveness of using multiple masters combined
>> with replicas. I don't see these answered in the Admin Guide.
>
> You mean, setting up regular read-only replicas slaved to the masters?
Yes.
>> I'll start with replication under MMR.
>
>> As I understand it, the replicas can only point at a single master.
>
> False.
Last I checked, the provider parameter in the syncrepl configuration block
was single valued. Which means either it gets pointed at a load balancer,
or it only talks to a single master. If that master goes down, it no
longer talks to anything without reconfiguration.
>> So, if
>> I have a 2 master MMR setup, I assume I would want to point half my
>> replicas at master A and the other half at master B for their updates.
>> This leads to a problem in my mind, in that if master A goes down, then
>> half of my replica pool is now going to remain completely out of sync
>> with the remaining master until master A is recovered. Throwing a Load
>> balancer in front of the two masters, and pointing the replicas at that
>> instead, is not a viable option because the two masters may be getting
>> updates in a different sequence, so if a replica disconnects from the LB
>> and then reconnects, the updates it could get fed from whatever master
>> the LB is pointing at could lead to inconsistencies.
>
> What inconsistencies? Each master's changes are stamped with its own sid.
> Any consumer is going to know about the contextCSNs of each master it
> talks to.
The consumer can only talk to a single master, unless it is pointed at a
load balancer. See notes above. In the load balancer case, I'm assuming
from what you say that if the replica gets disconnected because master A
went down, then when it reconnects to master B its CSN will be completely
different because the SID values differ, and it will do a full refresh.
Is that correct?
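To make the CSN/SID question concrete, here is a rough Python sketch of the per-SID bookkeeping described in the quoted reply. It is a toy model, not slapd's implementation: it assumes the 2.4 CSN layout of timestamp#count#sid#mod with hex fields, and the names parse_csn, newest_per_sid and sids_to_catch_up are made up for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple


class CSN(NamedTuple):
    timestamp: str  # e.g. "20090318100100.000000Z" (fixed width, so it sorts as text)
    count: int      # change counter within one timestamp
    sid: int        # server ID of the master that made the change
    mod: int        # modification number within one operation


def parse_csn(text: str) -> CSN:
    """Split a CSN of the assumed form time#count#sid#mod (hex fields)."""
    timestamp, count, sid, mod = text.split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def newest_per_sid(csns: Iterable[str]) -> Dict[int, CSN]:
    """Keep the newest CSN seen for each SID (hypothetical consumer state)."""
    newest: Dict[int, CSN] = {}
    for text in csns:
        csn = parse_csn(text)
        if csn.sid not in newest or csn > newest[csn.sid]:
            newest[csn.sid] = csn
    return newest


def sids_to_catch_up(consumer: Dict[int, CSN],
                     provider: Dict[int, CSN]) -> List[int]:
    """SIDs for which the provider holds changes newer than the consumer's."""
    return sorted(
        sid for sid, csn in provider.items()
        if sid not in consumer or csn > consumer[sid]
    )


# The replica was fed by master A (sid 1) and is behind on sid 2's changes.
replica_state = newest_per_sid([
    "20090318100100.000000Z#000000#001#000000",
    "20090318100050.000000Z#000000#002#000000",
])
master_b_state = newest_per_sid([
    "20090318100100.000000Z#000000#001#000000",
    "20090318100230.000000Z#000000#002#000000",
])
behind = sids_to_catch_up(replica_state, master_b_state)
```

In this model the replica reconnecting to master B is only behind on the changes stamped with sid 2; whether slapd then needs a full refresh or something cheaper is exactly the question above.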
>> Neither of these seems like a
>> good option. I don't see a good solution here to resolve this issue,
>> either, unless the replica could somehow know which master it had been
>> talking to,
>
> The replica always knows which master it's talking to...
>
>> and drop into refresh mode if it found itself talking to a new
>> master?
>
> Drop into refresh mode? Obviously in persist mode the consumer keeps a
> connection open to a specific master; a load balancer can't move an open
> connection. So obviously, if a particular master disappears, all of its
> clients are going to lose their connections and any consumers set up to
> retry are going to have to initiate new sessions. And every new
> replication session starts with a refresh phase. So this recovery is
> already automatic, it always has been.
Based on the CSN value. See question above about CSN & SID.
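For the "consumers set up to retry" part, a small Python sketch of how a syncrepl-style retry schedule plays out may help. It assumes the retry value is a list of "<interval> <count>" pairs where a count of "+" means retry indefinitely; retry_waits is a hypothetical helper, not anything slapd provides.

```python
from itertools import islice
from typing import Iterator


def retry_waits(spec: str) -> Iterator[int]:
    """Yield the wait in seconds before each reconnect attempt.

    spec is a list of "<interval> <count>" pairs; a count of "+" is
    taken to mean "keep retrying at this interval forever".
    """
    fields = spec.split()
    if len(fields) % 2 != 0:
        raise ValueError("retry spec must be interval/count pairs")
    for interval, count in zip(fields[0::2], fields[1::2]):
        seconds = int(interval)
        if count == "+":
            while True:
                yield seconds
        for _ in range(int(count)):
            yield seconds


# First eight waits for "5 5 300 +": five quick retries, then every 300s.
first_waits = list(islice(retry_waits("5 5 300 +"), 8))
```

Each successful reconnect would then begin a new replication session, which per the quoted reply always starts with a refresh phase.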
>> I'm also not clear on what happens if your replicas are
>> delta-syncrepl based, rather than normal syncrepl, in the LB setup.
>
> Not possible. Current delta-sync requires all updates to be logged in
> order; in an MMR setup you can't guarantee order so *nobody* can use
> delta-sync in this scenario.
>
>> For Mirror Mode, I would assume you could point the replicas at the LB
>> fronting the two masters, since only one master is ever receiving
>> changes. I also assume delta-syncrepl would be a completely valid option
>> for replication to the replicas, again because only one master is
>> getting the updates, so all updates would be logged in the same sequence
>> on both servers. However, I don't know if this is correct or not, or if
>> there are limitations here I haven't considered. When I was first
>> pondering this on the #openldap-devel channel in IRC, Matt Backes made a
>> comment about delta-syncrepl not working with Mirror Mode.
>
> For MirrorMode, delta-sync should work since there is only ever one
> source of changes, and they will be logged in order. There is a window of
> vulnerability where a server crashes after committing changes to its
> accesslog, before it replicates them to the mirror. Those changes will be
> temporarily lost, and create a gap in the mirror's log. When the original
> server comes back up, the mirror will receive those lost changes, but the
> strict ordering of its log will be broken. In this case though, the
> delta-sync consumer will be fine - if the lost changes caused no
> conflicts, they will simply be committed. If they do cause a conflict,
> the consumer will just fall back to refresh mode and the conflicts will be
> erased.
Ok, good. Mirror Mode sounds like the way to go.
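The fallback behaviour described in the quoted reply can be pictured with a toy Python model. This is only a sketch under loud assumptions: a real accesslog records LDAP operations, not "expected old value" tuples, and apply_delta and sync are invented names.

```python
from typing import Dict, List, Tuple

# One logged change: (entry DN, value expected before, value after).
Change = Tuple[str, str, str]


def apply_delta(replica: Dict[str, str], log: List[Change]) -> bool:
    """Apply logged changes in order; return False on the first conflict."""
    for dn, expected, new in log:
        if replica.get(dn) != expected:
            return False  # the change does not fit the replica's state
        replica[dn] = new
    return True


def sync(replica: Dict[str, str], log: List[Change],
         provider: Dict[str, str]) -> str:
    """Try delta replay first; on conflict, fall back to a full refresh."""
    if apply_delta(replica, log):
        return "delta"
    replica.clear()
    replica.update(provider)  # refresh: copy the provider's current state
    return "refresh"


provider = {"uid=a": "v2", "uid=b": "v1"}
log = [("uid=a", "v1", "v2")]

in_step = {"uid=a": "v1", "uid=b": "v1"}   # log applies cleanly
out_of_step = {"uid=a": "v0"}              # log conflicts with local state

mode_clean = sync(in_step, log, provider)
mode_conflict = sync(out_of_step, log, provider)
```

Either way the replica ends up matching the provider; the only difference in this model is whether it got there by replaying the log or by refreshing.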
>> So, basically, I'm at a loss, if my understanding of things is correct, on
>> how I provide a consistent replicated environment for my customers,
>> while also providing master/master failover.
>
> This appears to have been a -software question, not a -devel question.
> Perhaps you should summarize back to the -software list and end this
> thread here.
Done.
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc