Tommy Pham wrote:
My concerns are not just about performance for a 1-box setup, or 1 master with multiple slave replicas and proxies. I'm more interested in robustness, such as Dynamic Schema(s), Multi-Master Replication, and Dynamic Configuration (as featured in Apache DS).
Dynamic configuration and dynamic loading of schema have been supported since OpenLDAP 2.3. Multi-master replication is supported in OpenLDAP 2.4 (although in general, actual multi-master usage is almost always the wrong thing to do; floating master or single-master with hot standby are the only reliable approaches).
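As a concrete illustration (a minimal sketch, assuming slapd was started with the cn=config backend and an ldapi:/// listener), changing the server's log level at runtime is just an LDAP modify against cn=config:

```ldif
# Raise logging to "stats" without touching slapd.conf or restarting slapd
dn: cn=config
changetype: modify
replace: olcLogLevel
olcLogLevel: stats
```

Applied with something like `ldapmodify -Y EXTERNAL -H ldapi:///`, the change takes effect immediately. Schema elements are added the same way, as entries under cn=schema,cn=config.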
Multi-master or cluster setups have higher reliability and performance under heavy load with large data in my experience.
What experience is that? It would help to know what your point of reference is. What do you define as heavy load or large data? What is your definition of reliability? We've run OpenLDAP 2.3 on an SGI Altix with 32 Itanium CPUs on a database of over 150 million entries, delivering transaction rates of over 22,000 searches per second concurrent with over 4800 modifications per second, sustained for several hours. We know that OpenLDAP is unique in these capabilities because several other directory server packages also participated in these tests, but most of them failed hard at much smaller sizes. The only other one to survive to the 150 million entry mark was turning in transaction rates orders of magnitude slower than ours.
On a dual-processor AMD Opteron server the slapd frontend can process over 32,000 authentications per second on 100Mbps ethernet - that's equivalent to over 128,000 packets per second, or over 90% of the theoretical bandwidth of the medium. In a clustered environment you'll never get rates this high or latencies this low, because of the overhead in communicating with a remote DB server/cluster.
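The packet arithmetic above can be checked on the back of an envelope (a sketch; the four-packets-per-operation count and the ~88-byte average wire size are my assumptions for illustration, not measurements from that test):

```python
# Back-of-the-envelope check of the 100Mbps figures quoted above.
auth_rate = 32_000       # authentications per second (from the test)
packets_per_auth = 4     # assumed: bind request/response plus TCP overhead traffic
avg_wire_bytes = 88      # assumed average on-the-wire frame size, including framing

pps = auth_rate * packets_per_auth
utilization = pps * avg_wire_bytes * 8 / 100_000_000  # fraction of 100Mbps

print(pps, round(utilization, 2))  # 128000 packets/s, ~0.9 of the medium
```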
Also, because I'm migrating from an MS-based platform, I intend to integrate other application servers into LDAP as well, such as DNS (via bind-dlz), FTP, e-mail & groupware, Samba, etc... in the same way as MS integrates DNS and Exchange in its Active Directory. Will OpenLDAP with back-bdb/hdb support all of that and still perform well when there are millions of entries?
Yes, easily, and far better than anything else could ever hope to.
As for native DB support vs layer like ODBC, why not just use the DB's native client library?
That only eliminates part of the overhead. Back-bdb's storage format is also highly optimized; getting raw access to the data of a relational system still means accessing individual rows and columns. This is still a significant performance cost.
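To make the point concrete, here is a small sketch (using SQLite and JSON purely as stand-ins; back-bdb actually uses Berkeley DB with its own entry encoding): a normalized relational layout forces one row fetch per attribute value plus reassembly on every lookup, while a single pre-encoded record is one fetch and one decode.

```python
import sqlite3, json

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Relational layout: one row per attribute value -- what raw access
# to an RDBMS's tables gives you.
cur.execute("CREATE TABLE attrs (entry_id INT, attr TEXT, value TEXT)")
cur.executemany("INSERT INTO attrs VALUES (1, ?, ?)",
                [("cn", "tommy"), ("mail", "t@example.com"),
                 ("objectClass", "person")])

# back-bdb-style layout: the whole entry is one pre-encoded record keyed by ID.
cur.execute("CREATE TABLE entries (entry_id INT PRIMARY KEY, blob TEXT)")
entry = {"cn": ["tommy"], "mail": ["t@example.com"], "objectClass": ["person"]}
cur.execute("INSERT INTO entries VALUES (1, ?)", (json.dumps(entry),))

# Relational read: N rows fetched and reassembled into an entry per lookup.
rows = cur.execute("SELECT attr, value FROM attrs WHERE entry_id = 1").fetchall()
reassembled = {}
for attr, value in rows:
    reassembled.setdefault(attr, []).append(value)

# Single-record read: one fetch, one decode.
(blob,) = cur.execute("SELECT blob FROM entries WHERE entry_id = 1").fetchone()
decoded = json.loads(blob)

assert reassembled == decoded  # same entry, very different work per read
```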
(I guess this falls in line with the development mailing list more than this mailing list.) I understand that "a directory is a specialized database optimized for reading, browsing and searching" and not writing. That's why I opt for a dedicated RDBMS over an embedded one for distributed computing... just as enterprise applications are developed in n-tier architectures.
Separating the OpenLDAP frontend from the storage backend offers no benefits; it only incurs additional performance and administration overhead. N-tier architectures make sense in large enterprises for keeping data close to where it will be used, but they don't offer any actual reliability benefits. Simple algebra tells you that these designs decrease MTBF; they can never increase it.
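The algebra is simply that serial dependencies add failure rates: if the frontend cannot answer without the remote DB tier, both must be up, so the combined MTBF is below either component's (the 10,000-hour figures below are hypothetical, for illustration only):

```python
def series_mtbf(*mtbfs):
    # Components in series (all must work for service to be up):
    # failure rates (1/MTBF) add, so the combined MTBF is their
    # harmonic combination -- always below the weakest component.
    return 1.0 / sum(1.0 / m for m in mtbfs)

# Hypothetical: frontend box and separate DB tier, 10,000 h MTBF each.
combined = series_mtbf(10_000, 10_000)
print(combined)  # 5000.0 -- half the MTBF of either box alone
```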