Advise on distributed directory service - openldap-technical

22 Feb 2011


      Hello,
I am asked to design a replicated OpenLDAP implementation for use on
1500 of our customers servers who are now use a non-replicated
configuration using the standard passwd/shadow backend combined with a
PostgreSQL database. Our customers consist of primary schools and will
use the database for authentication through Samba. The reason we want to
replicate the data is so that we can offer email and other services from
a central datacentre.
Having considered several options ( Multi-Master, MirrorMode ), after
some consulting in our team, I've decided we'll opt for the 'simple'
Master-Slave setup, with a Master at each customer site and 7
(virtualised) servers each handling an everage of 215 customer
databases.
Having 7 servers however has the disadvantage of not knowing on which
server a customer's database will be when a user is trying to
authenticate on their email for example. My first thought was that I
would configure a special 'redirection' server only containing
referrals. I 'hoped' all common clients would cache these referrals so
that load on this redirection server would be low.
I'm now doubting this choice for 2 reasons:
- The following of this referrals seams highly unstandardized. The
biggest users of the referral functionality will probably be PHP, in the
PHP manual the documentation on rebinding / referral chasing is not very
thorough. Any automation of this is also not present, and I would need
to implement any caching myself too. (
http://www.php.net/manual/en/function.ldap-set-rebind-proc.php ) This is
'annoying' for me, but I highly doubt third party applications will all
implement this in a reasonable way, and I was not planning on
customizing all of the software we use...
- In the notes on the documentation of referrals for OpenLDAP 2.4 (
http://www.openldap.org/doc/admin24/referrals.html ), the following note
is present:
Note: the use of referrals to construct a Distributed Directory Service
is extremely clumsy and not well supported by common clients. If an
existing installation has already been built using referrals, the use of
the chain overlay to hide the referrals will greatly improve the
usability of the Directory system. A better approach would be to use
explicitly defined local and proxy databases in subordinate
configurations to provide a seamless view of the Distributed Directory.
Though I am usually very stubborn, I try to avoid designing systems in a
way the documentation says is not recommended.
The use of 1 single proxy cache server seams to 'ease the pain' a bit,
but does not seam like a very scalable approach. The use of
proxy-overlays would make the server the client connects to function as
a kind of non caching proxy, and in general 'be involved' in all of the
requests, which again doesn't seam very desirable, and very
single-point-of-failure.
All servers that are configured (customer servers excluded unless they
opt/pay for it) will be configured in a failover way, I didn't mention
this above to avoid too much complication.
What do you recommend for distributing the databases and still be able
to easily use them? Do I overestimate the amount of traffic/work a
server has in the proxy overlay method?
Germ van Eck
Engineer
Station to Station B.V.
--