Hello,
I have been asked to design a replicated OpenLDAP implementation for use on 1500 of our customers' servers, which currently use a non-replicated configuration based on the standard passwd/shadow backend combined with a PostgreSQL database. Our customers are primary schools and will use the database for authentication through Samba. The reason we want to replicate the data is so that we can offer email and other services from a central datacentre.

Having considered several options (Multi-Master, MirrorMode) and after some consultation within our team, I've decided we'll opt for the 'simple' Master-Slave setup, with a Master at each customer site and 7 (virtualised) servers each handling an average of 215 customer databases. Having 7 servers, however, has the disadvantage of not knowing on which server a customer's database resides when a user is trying to authenticate, for example for their email.

My first thought was to configure a special 'redirection' server containing only referrals. I 'hoped' all common clients would cache these referrals so that the load on this redirection server would be low. I'm now doubting this choice for 2 reasons:
- The way these referrals are followed seems highly unstandardized. The biggest users of the referral functionality will probably be PHP applications, and in the PHP manual the documentation on rebinding / referral chasing is not very thorough. There is no automation for it either, so I would need to implement any caching myself. ( http://www.php.net/manual/en/function.ldap-set-rebind-proc.php ) This is 'annoying' for me, but I highly doubt that third-party applications will all implement this in a reasonable way, and I was not planning on customizing all of the software we use...
- In the referrals documentation for OpenLDAP 2.4 ( http://www.openldap.org/doc/admin24/referrals.html ), the following note is present: "Note: the use of referrals to construct a Distributed Directory Service is extremely clumsy and not well supported by common clients. If an existing installation has already been built using referrals, the use of the chain overlay to hide the referrals will greatly improve the usability of the Directory system. A better approach would be to use explicitly defined local and proxy databases in subordinate configurations to provide a seamless view of the Distributed Directory."
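For completeness, hiding referrals with the chain overlay would, as far as I understand it, look roughly like this slapd.conf fragment on the frontend (the URI and option values are placeholders, not our real configuration):

```
# Sketch: chase referrals server-side so clients never see them.
overlay               chain
chain-uri             "ldap://backend-1.example.nl"
chain-rebind-as-user  TRUE
chain-return-error    TRUE
```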
Though I am usually very stubborn, I try to avoid designing systems in a way the documentation says is not recommended.
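To make concrete what every client application would otherwise have to implement itself, the referral caching logic amounts to roughly the following (a minimal pure-Python sketch; the suffix-to-URI mapping and the hostnames are invented for illustration):

```python
# Sketch of the referral cache each client would need: map the school
# subtree of a user's DN to the backend server holding that database.
# All suffixes and URIs below are invented examples.
class ReferralCache:
    def __init__(self):
        self._cache = {}  # school suffix -> server URI

    def record(self, suffix, uri):
        """Remember a referral returned by the redirection server."""
        self._cache[suffix] = uri

    def lookup(self, dn):
        """Return the cached server URI for a DN, or None on a cache miss."""
        for suffix, uri in self._cache.items():
            if dn.endswith(suffix):
                return uri
        return None

cache = ReferralCache()
cache.record("ou=school-1,o=example,c=nl", "ldap://backend-3.example.nl")

print(cache.lookup("uid=jan,ou=school-1,o=example,c=nl"))
# ldap://backend-3.example.nl
print(cache.lookup("uid=piet,ou=school-2,o=example,c=nl"))
# None (miss: the client must chase the referral, then record it)
```

Trivial as it looks, this is per-client state that every application stack would have to carry and invalidate correctly.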
The use of a single proxy cache server seems to 'ease the pain' a bit, but does not seem like a very scalable approach. The use of proxy overlays would make the server the client connects to act as a kind of non-caching proxy and in general 'be involved' in all of the requests, which again does not seem very desirable, and introduces a single point of failure.
All servers (customer servers excluded, unless they opt to pay for it) will be configured for failover; I didn't mention this above to avoid too much complication.
What do you recommend for distributing the databases while keeping them easy to use? Am I overestimating the amount of traffic/work a server has to do in the proxy-overlay method?
Germ van Eck
Engineer
Station to Station B.V.
--
On Tue, Feb 22, 2011 at 05:07:27PM +0100, Germ van Ek wrote:
> Note: the use of referrals to construct a Distributed Directory Service is extremely clumsy and not well supported by common clients. If an
Very true...
> existing installation has already been built using referrals, the use of the chain overlay to hide the referrals will greatly improve the usability of the Directory system. A better approach would be to use explicitly defined local and proxy databases in subordinate configurations to provide a seamless view of the Distributed Directory.
> The use of a single proxy cache server seems to 'ease the pain' a bit, but does not seem like a very scalable approach. The use of proxy overlays would make the server the client connects to act as a kind of non-caching proxy and in general 'be involved' in all of the requests, which again does not seem very desirable, and introduces a single point of failure.
As your proxy server will not hold any databases there is no reason why you cannot have many copies of it. They can all be identical. You can equip them with caches if you want to, or just set them up to pass the queries through to the appropriate backend server. This removes the single point of failure, and (with caches) improves the overall throughput of the system.
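A stateless proxy of this kind can be built from explicitly defined back-ldap databases glued under a common suffix, along the lines of this sketch (the suffixes and hostnames are invented; subordinate databases must be defined before their superior):

```
# Sketch: one proxy database per backend server, glued into one tree.
database     ldap
suffix       "ou=school-1,o=example,c=nl"
subordinate
uri          "ldap://backend-1.example.nl"

database     ldap
suffix       "ou=school-2,o=example,c=nl"
subordinate
uri          "ldap://backend-2.example.nl"

# Superior (glue) database presenting the seamless view; a small local
# database holding just the top entry would also work here.
database     ldap
suffix       "o=example,c=nl"
uri          "ldap://backend-1.example.nl"
```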
Andrew
Andrew,
Thank you for replying to my question.
> You can equip them with caches if you want to, or just set them up to
> pass the queries through to the appropriate backend server.

Is a non-caching proxy here the same thing as a chain overlay?
In the LDAP proxy solution you suggested, could I then simply put a generic load balancer like Linux Virtual Server in front of the 'team' of LDAP proxies? I'd like to have a single IP / hostname I can use for the LDAP clients.
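What I have in mind is something like this LVS/keepalived fragment (addresses are invented; since the proxies are identical and stateless, a plain round-robin over TCP port 389 should do):

```
# Sketch: one virtual IP for the whole team of LDAP proxies.
virtual_server 10.0.0.10 389 {
    delay_loop 10
    lb_algo rr
    lb_kind DR
    protocol TCP

    real_server 10.0.0.11 389 {
        TCP_CHECK { connect_timeout 3 }
    }
    real_server 10.0.0.12 389 {
        TCP_CHECK { connect_timeout 3 }
    }
}
```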
I like the idea of the caching proxies. The major advantage is that I expect only a relatively small subset of users to be active at any one time, so the caches could be very small and therefore very fast.
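I imagine the cache configuration on each proxy would look something like this slapo-pcache sketch (the sizes, TTL and attribute list are illustrative guesses, not tuned values):

```
# Sketch: cache only the query template authentication uses,
# with a small entry limit and a short TTL.
overlay          pcache
pcache           hdb 10000 1 1000 100
pcacheAttrset    0 uid userPassword uidNumber gidNumber
pcacheTemplate   (uid=) 0 3600
```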
Best Regards,
Germ van Eck

-----Original Message-----
From: Andrew Findlay [mailto:andrew.findlay@skills-1st.co.uk]
Sent: Thursday 24 February 2011 17:28
To: Germ van Eck
CC: openldap-technical@openldap.org
Subject: Re: Advise on distributed directory service
On Tue, Feb 22, 2011 at 05:07:27PM +0100, Germ van Ek wrote:

> Note: the use of referrals to construct a Distributed Directory Service is extremely clumsy and not well supported by common clients. If an

Very true...

> existing installation has already been built using referrals, the use of the chain overlay to hide the referrals will greatly improve the usability of the Directory system. A better approach would be to use explicitly defined local and proxy databases in subordinate configurations to provide a seamless view of the Distributed Directory.

> The use of a single proxy cache server seems to 'ease the pain' a bit, but does not seem like a very scalable approach. The use of proxy overlays would make the server the client connects to act as a kind of non-caching proxy and in general 'be involved' in all of the requests, which again does not seem very desirable, and introduces a single point of failure.

As your proxy server will not hold any databases there is no reason why you cannot have many copies of it. They can all be identical. You can equip them with caches if you want to, or just set them up to pass the queries through to the appropriate backend server. This removes the single point of failure, and (with caches) improves the overall throughput of the system.

Andrew
Germ,
Germ van Ek wrote on 22.02.2011 at 17:07:
> I am asked to design a replicated OpenLDAP implementation for use on 1500 of our customers' servers, which currently use a non-replicated configuration based on the standard passwd/shadow backend combined with a PostgreSQL database. Our customers consist of primary schools and will use the database for authentication through Samba. The reason we want to replicate the data is so that we can offer email and other services from a central datacentre.
If every school has its own tree in the DIT, like ou=school-1,o=your organization,c=nl, and you have no need to write directly to the directory except from the school itself (writes at the central site could use referrals to the school), you could set up the school LDAP server as a syncrepl provider and replicate to a glued database of 1500 school databases (consumer databases) in the datacentre, where you then have the whole DIT together:

o=your organization,c=nl
- ou=school-1,o=your organization,c=nl
- ou=school-2,o=your organization,c=nl
- ...
- ou=school-1500,o=your organization,c=nl
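In slapd.conf terms, the datacentre consumer side would look roughly like this for each school, with a glue database on top (a sketch only; the rid, directory paths, bind DN and credentials are placeholders, and the subordinate databases must be defined before the superior):

```
# Sketch: one consumer database per school, subordinate to the glue suffix.
database     hdb
suffix       "ou=school-1,o=your organization,c=nl"
subordinate
directory    /var/lib/ldap/school-1
syncrepl     rid=001
             provider=ldap://school-1.example.nl
             type=refreshAndPersist
             searchbase="ou=school-1,o=your organization,c=nl"
             bindmethod=simple
             binddn="cn=replicator,ou=school-1,o=your organization,c=nl"
             credentials=secret

# Glue database holding the top entry and presenting the whole DIT.
database     hdb
suffix       "o=your organization,c=nl"
directory    /var/lib/ldap/glue
```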
Marc