Hello,
I have been asked to design a replicated OpenLDAP implementation for use on the servers of 1500 of our customers, which currently use a non-replicated configuration based on the standard passwd/shadow backend combined with a PostgreSQL database. Our customers are primary schools, and they will use the database for authentication through Samba. The reason we want to replicate the data is so that we can offer email and other services from a central datacentre.
Having considered several options (Multi-Master, MirrorMode) and after some consultation within our team, I've decided to opt for the 'simple' Master-Slave setup, with a Master at each customer site and 7 (virtualised) servers each handling an average of about 215 customer databases. Having 7 servers, however, has the disadvantage that we do not know which server holds a customer's database when a user tries to authenticate, for example to read their email.
My first thought was to configure a special 'redirection' server containing only referrals. I had hoped that all common clients would cache these referrals, so that the load on this redirection server would stay low. I'm now doubting this choice for 2 reasons:
- Referral chasing seems highly unstandardized. The biggest user of the referral functionality will probably be PHP, and the PHP manual's documentation on rebinding / referral chasing is not very thorough ( http://www.php.net/manual/en/function.ldap-set-rebind-proc.php ). None of it is automated either, and I would need to implement any caching myself too. This is 'annoying' for me, but more importantly I highly doubt that third-party applications will all implement this in a reasonable way, and I was not planning on customizing all of the software we use...
- The OpenLDAP 2.4 documentation on referrals ( http://www.openldap.org/doc/admin24/referrals.html ) contains the following note: "Note: the use of referrals to construct a Distributed Directory Service is extremely clumsy and not well supported by common clients. If an existing installation has already been built using referrals, the use of the chain overlay to hide the referrals will greatly improve the usability of the Directory system. A better approach would be to use explicitly defined local and proxy databases in subordinate configurations to provide a seamless view of the Distributed Directory."
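To make the first point concrete, this is roughly the kind of client-side referral caching I would have to write myself. It is only a sketch in Python rather than PHP, and everything in it is an assumption for illustration: the `ReferralCache` name, the 300-second lifetime, the suffix layout and the hostname are made up, and `fake_resolve` stands in for an actual query to the redirection server.

```python
import time
from typing import Callable, Dict, List, Tuple


class ReferralCache:
    """Remembers which server a base DN was referred to, for a limited time."""

    def __init__(self, resolve: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve      # asks the redirection server (hypothetical)
        self._ttl = ttl_seconds      # how long a referral is trusted (assumption)
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def lookup(self, base_dn: str) -> str:
        now = self._clock()
        hit = self._entries.get(base_dn)
        if hit is not None and hit[1] > now:
            return hit[0]            # still fresh: no network round trip
        uri = self._resolve(base_dn) # expired or unknown: ask again
        self._entries[base_dn] = (uri, now + self._ttl)
        return uri


# Stand-in for the redirection server; records how often it is asked.
calls: List[str] = []


def fake_resolve(base_dn: str) -> str:
    calls.append(base_dn)
    return "ldap://ldap3.example.com"  # made-up hostname


cache = ReferralCache(fake_resolve, ttl_seconds=300)
first = cache.lookup("ou=school0042,dc=example,dc=com")
second = cache.lookup("ou=school0042,dc=example,dc=com")
print(len(calls))  # prints 1: the second lookup was served from the cache
```

Every application that talks to the directory would need something equivalent, which is exactly what I cannot expect from third-party software.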
Though I am usually very stubborn, I try to avoid designing systems in a way the documentation says is not recommended.
Using a single proxy cache server seems to 'ease the pain' a bit, but does not seem like a very scalable approach. Using the proxy overlays would make the server that the client connects to act as a kind of non-caching proxy that is involved in every request, which again doesn't seem very desirable and looks very much like a single point of failure.
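As I understand the proxy approach, the front-end server has to decide for every incoming request which back-end server holds the matching customer database. The sketch below shows that decision as a longest-suffix match in Python. It is not how slapd is implemented, only an illustration of the work done per request; the suffixes, hostnames and the `pick_backend` helper are hypothetical, and the DN handling is deliberately naive (it ignores escaped commas).

```python
from typing import Dict, Optional


def normalize(dn: str) -> str:
    """Lower-case a DN and strip spaces around its components (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def pick_backend(request_dn: str, backends: Dict[str, str]) -> Optional[str]:
    """Return the URI of the back end whose suffix best matches request_dn."""
    dn = normalize(request_dn)
    best_suffix = ""
    best_uri: Optional[str] = None
    for suffix, uri in backends.items():
        s = normalize(suffix)
        matches = dn == s or dn.endswith("," + s)
        if matches and len(s) > len(best_suffix):
            best_suffix, best_uri = s, uri  # prefer the most specific suffix
    return best_uri


# Made-up mapping of customer suffixes to back-end servers.
backends = {
    "ou=school0000,dc=example,dc=com": "ldap://ldap1.example.com",
    "ou=school0001,dc=example,dc=com": "ldap://ldap2.example.com",
    "ou=school0002,dc=example,dc=com": "ldap://ldap3.example.com",
}

print(pick_backend("uid=jan, ou=School0001, dc=example, dc=com", backends))
print(pick_backend("uid=piet,ou=unknown,dc=example,dc=com", backends))
```

The lookup itself is cheap; my concern is that every search and bind then also passes through this one front end on its way to the real database.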
All servers we configure (customer servers excluded, unless the customer opts and pays for it) will be set up with failover; I left this out above to avoid complicating the description.
What would you recommend for distributing the databases while still being able to use them easily? Am I overestimating the amount of traffic/work a server has to handle with the proxy overlay method?
Germ van Eck
Engineer
Station to Station B.V.
--