Folks,
I'm going through the documentation at http://www.openldap.org/doc/admin24/, the OpenLDAP FAQ-o-Matic at http://www.openldap.org/faq/data/cache/1.html, and the archives of the various OpenLDAP mailing lists, but I have not yet found anything that discusses how one might architect a large-scale OpenLDAP system with multiple masters, multiple slaves, and so on, for best performance and low latency.
In our case, we are running OpenLDAP 2.3.something (a few versions behind the official latest stable release) at a large university with roughly 48,000 students, 2,700 faculty, and 19,000 employees, and we recently hit our four millionth object. We are now running into performance issues that will keep us from rolling out some other large projects, at least until we can get these problems resolved.
I do not yet understand a great deal about how our existing OpenLDAP systems are designed, but I am curious to learn what kinds of recommendations you folks would have for a large scale system like this.
In the far, dark, distant past, I know that OpenLDAP did not cope well when both updates and reads hit the same server, so the recommendation at the time was to make all updates on the master server and replicate them out to the slaves, where all the read operations would occur. You could even go so far as to set up a slave on pretty much every major client machine, for maximum distribution and replication of the data, and maximum scalability of the overall LDAP system.
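In case it helps to have something concrete to correct, here is a minimal sketch of what I understand the classic single-master/many-slaves setup to look like with syncrepl in slapd.conf (the hostnames, suffix, and credentials are made up for illustration):

```
# --- on the master (provider) ---
database    bdb
suffix      "dc=example,dc=edu"        # hypothetical suffix
overlay     syncprov                   # publish changes to the consumers

# --- on each slave (consumer) ---
database    bdb
suffix      "dc=example,dc=edu"
syncrepl    rid=001
            provider=ldap://master.example.edu
            type=refreshAndPersist
            searchbase="dc=example,dc=edu"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=edu"
            credentials=secret
updateref   ldap://master.example.edu  # refer any writes back to the master
```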
I know that modern versions of OpenLDAP are able to handle a mix of both updates and reads much better, so that the old style architecture is not so necessary. But for a large-scale system like we have, would it not be wise to use the old-style architecture for maximum performance and scalability?
If you did use a multi-master cluster pair environment that handled all the updates and all the LDAP queries that were generated, what kind of performance do you think you should reasonably be able to get with the latest version of 2.4.whatever on high-end hardware, and what kind of hardware would you consider to be "high-end" for that environment? Is CPU more important, or RAM, or disk space/latency?
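For reference, this is what I understand a 2.4 multi-master ("MirrorMode") pair to look like, so please correct me if I'm planning around the wrong mechanism (again, hostnames and credentials are invented):

```
# --- on ldap1.example.edu; ldap2 is the mirror image, with serverID 2
#     and a syncrepl stanza pointing back at ldap1 ---
serverID    1
database    hdb
suffix      "dc=example,dc=edu"
overlay     syncprov
syncrepl    rid=001
            provider=ldap://ldap2.example.edu
            type=refreshAndPersist
            searchbase="dc=example,dc=edu"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=edu"
            credentials=secret
mirrormode  on
```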
Alternatively, if you went to a three-level master(s)->proxies->slaves architecture [0], what kind of performance would you expect to be able to get, and how many machines would you expect that to be able to scale to? Are there any other major issues to be concerned about with that kind of architecture, like latency of updates getting pushed out to the leaf-node slaves?
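To make sure I'm describing the same thing: I assume the middle "proxy" tier here is a cascading replica, i.e. a server that is both a syncrepl consumer of the master and a syncprov provider for the leaf slaves, roughly like this (names made up):

```
# --- on each middle-tier server ---
database    hdb
suffix      "dc=example,dc=edu"
# consume from the master...
syncrepl    rid=001
            provider=ldap://master.example.edu
            type=refreshAndPersist
            searchbase="dc=example,dc=edu"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=edu"
            credentials=secret
# ...and re-provide the same data to the leaf slaves
overlay     syncprov
```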
How about the ultimate maximum distribution scenario, where you put an LDAP slave on virtually every major LDAP client machine?
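Concretely, I picture each client box running its own slapd as an ordinary consumer, with the local applications pointing at localhost, something like this (hostnames and paths are assumptions on my part):

```
# slapd.conf on the client machine: just another consumer
database    bdb
suffix      "dc=example,dc=edu"
syncrepl    rid=001
            provider=ldap://ldap-hub.example.edu
            type=refreshAndPersist
            searchbase="dc=example,dc=edu"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=edu"
            credentials=secret

# /etc/ldap.conf on the same machine: applications query locally
#   URI   ldap://localhost
#   BASE  dc=example,dc=edu
```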
Any and all advice you can provide would be appreciated, and in particular I would greatly appreciate it if you can provide any references to documentation, FAQs, mailing list archives where I can read more.
Thanks!
[0] Is this test045, as I believe is mentioned at http://www.openldap.org/lists/openldap-software/200707/msg00320.html?