Brad Knowles wrote:
> Folks,
> I'm going through the documentation at http://www.openldap.org/doc/admin24/, the OpenLDAP FAQ-o-Matic at http://www.openldap.org/faq/data/cache/1.html, and the archives of the various OpenLDAP mailing lists, but I have not yet found anything that discusses how one might want to architect a large-scale OpenLDAP system with multiple masters, multiple slaves, etc., for best performance and low latency.
> In our case, we have OpenLDAP 2.3.something (a few versions behind the official latest stable release), and we've recently hit our four millionth object (at a large university with something like 48,000 students, 2,700 faculty, and 19,000 employees). We're running into some performance issues that are going to keep us from rolling out some other large projects, at least until we can get the problems resolved.
In my experience, 4 million objects (at around 3KB per entry) is near the limit of what will fit into 16GB of RAM. Sounds like you need a server with more than 16GB if you want to keep growing and not be waiting on disks.
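To put rough numbers on that: 4,000,000 entries at ~3KB each is about 12GB of raw entry data, before indexes and BDB overhead. If you want the working set memory-resident, the two knobs are the BerkeleyDB buffer cache in DB_CONFIG and slapd's own entry cache. A minimal sketch, assuming back-bdb/hdb; all numbers are illustrative and need tuning against your actual data:

  # DB_CONFIG, placed in the database directory (BerkeleyDB buffer cache)
  # set_cachesize <gbytes> <bytes> <number-of-segments>
  set_cachesize 12 0 4
  # keep old transaction log files from accumulating (BDB 4.2+)
  set_flags DB_LOG_AUTOREMOVE

  # slapd.conf, inside the back-bdb/hdb database section
  cachesize    500000      # entries held in slapd's entry cache (illustrative)
  idlcachesize 1500000     # IDL cache; ~3x cachesize is a common rule of thumb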
> I do not yet understand a great deal about how our existing OpenLDAP systems are designed, but I am curious to learn what kinds of recommendations you folks would have for a large-scale system like this.
> In the far, dark, distant past, I know that OpenLDAP did not handle situations well when you had both updates and reads occurring on the same system, so the recommendation at the time was to make all updates on the master server and then replicate them out to the slaves, where all the read operations would occur. You could even go so far as to set up slaves on pretty much every single major client machine, for maximum distribution and replication of the data, and maximum scalability of the overall LDAP system.
The single-master constraints on OpenLDAP were never about performance. Even with OpenLDAP 2.2, the concurrent read/write rates for back-bdb were faster than those of any other directory server. It's always been about data consistency, and the fact that it's so easy to lose it in a multi-master setup.
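For reference, the provider side of that single-master layout is small. A minimal sketch of the master's slapd.conf, assuming syncrepl-based replication (available since 2.3) and made-up names:

  # master (provider) slapd.conf fragment -- names are placeholders
  database  hdb
  suffix    "dc=example,dc=edu"
  rootdn    "cn=manager,dc=example,dc=edu"
  directory /var/openldap-data

  # serve changes to the syncrepl consumers (replicas)
  overlay syncprov
  syncprov-checkpoint 100 10
  syncprov-sessionlog 100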
> I know that modern versions of OpenLDAP are able to handle a mix of both updates and reads much better, so the old-style architecture is not as necessary. But for a large-scale system like ours, would it not be wise to use the old-style architecture anyway, for maximum performance and scalability?
> If you did use a multi-master cluster pair that handled all the updates and all the LDAP queries generated, what kind of performance do you think you could reasonably get with the latest version of 2.4.whatever on high-end hardware,
You've been brainwashed by the marketing lies other LDAP vendors tell about multi-master replication. Multi-master has no relation to performance; it's only about fault tolerance and high availability. Whether you choose a single-master or a multi-master setup, with the same number of machines the same number of writes must be propagated to every server, so the overall performance will be the same.
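For completeness, this is roughly what a two-node MirrorMode pair looks like in 2.4; note again that it buys you failover, not throughput. A sketch only, with placeholder names and credentials:

  # node 1 slapd.conf fragment; node 2 is identical apart from
  # its serverID and the provider URL pointing back at node 1
  serverID 1

  database hdb
  suffix   "dc=example,dc=edu"

  overlay syncprov

  syncrepl rid=001
           provider=ldap://ldap2.example.edu
           type=refreshAndPersist
           retry="5 5 300 +"
           searchbase="dc=example,dc=edu"
           bindmethod=simple
           binddn="cn=replicator,dc=example,dc=edu"
           credentials=secret

  mirrormode on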
> and what kind of hardware would you consider to be "high-end" for that environment?
That's a pointless question. The right question is: how fast do you need it to be? What load are you experiencing now, what constitutes a noticeable delay, and how often do you see them?
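If you want hard numbers for the current load before speccing hardware, the monitor backend is a cheap way to get them, assuming your slapd was built with back-monitor; the bind DN below is a placeholder:

  # slapd.conf: enable the monitoring backend
  database monitor

  # then poll the operation counters periodically, e.g.:
  # ldapsearch -x -H ldap://localhost -D "cn=manager,dc=example,dc=edu" -W \
  #     -b "cn=Operations,cn=Monitor" -s sub \
  #     '(objectClass=*)' monitorOpInitiated monitorOpCompleted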
> Is CPU more important, or RAM, or disk space/latency?
If you have enough RAM, disk latency shouldn't be a problem. Disk space is so cheap today that it should never be a problem. CPU, well, that depends on your performance target.
> Alternatively, if you went to a three-level master(s)->proxies->slaves architecture [0], what kind of performance would you expect to be able to get, and how many machines would you expect that to scale to? Are there any other major issues to be concerned about with that kind of architecture, like the latency of updates getting pushed out to the leaf-node slaves?
Yes.
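If you do go that route, the usual shape of the middle tier is a consumer that also runs syncprov, so the leaf replicas sync from it rather than hammering the master directly. A sketch of a middle-tier slapd.conf, placeholder names again:

  # middle-tier fragment: consumer of the master,
  # provider for the leaf slaves
  database hdb
  suffix   "dc=example,dc=edu"

  syncrepl rid=010
           provider=ldap://master.example.edu
           type=refreshAndPersist
           retry="5 5 300 +"
           searchbase="dc=example,dc=edu"
           bindmethod=simple
           binddn="cn=replicator,dc=example,dc=edu"
           credentials=secret

  # re-serve the replicated data to downstream consumers
  overlay syncprov

The trade-off is exactly the one you name: each extra hop adds propagation delay, so the leaf nodes lag the master by the sum of the hops.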
> How about the ultimate maximum distribution scenario, where you put an LDAP slave on virtually every major LDAP client machine?
Generally I like the idea of having compact/simple slapd configs spread all over. With the old slapd.conf that would have been rather painful to administer, though. Also, in general, more moving parts means more things that can break.
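For what it's worth, those leaf configs really can be tiny. A sketch of a whole leaf replica, with writes referred back upstream; same placeholder names as above:

  # leaf replica slapd.conf fragment
  database  hdb
  suffix    "dc=example,dc=edu"
  directory /var/openldap-data

  syncrepl  rid=101
            provider=ldap://proxy1.example.edu
            type=refreshAndPersist
            retry="5 5 300 +"
            searchbase="dc=example,dc=edu"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=edu"
            credentials=secret

  # refer any stray write attempts back up the chain
  updateref ldap://master.example.edu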
> Any and all advice you can provide would be appreciated, and in particular I would greatly appreciate it if you can provide any references to documentation, FAQs, or mailing list archives where I can read more.