Quanah Gibson-Mount wrote:
RAM is probably the most important, but you also will want fast disks, proper partitioning with the logs kept separate from the database, and I recommend a non-journaling filesystem. Two or more cores are also useful. Unfortunately I don't really see enough information from your end (yet) to really say much beyond that.
A non-journaling filesystem? Really? Can you explain why you would make that performance trade-off?
Keep in mind that there are two journaling methodologies that are prevalent:

1: meta-data only journaling (most common)
2: full-block journaling, or data journaling (available in several filesystems, but usually defaulted OFF)
I can understand being hesitant about #2 if (and only if) you are in a high-write scenario, which isn't clear from the original post. If the LDAP server is a high-read, low-write server (as most are), then either type of journaling should have a negligible performance impact.
Even in a high-write scenario, method #1 (meta-data only journaling) should have a negligible impact.
Since the original poster mentions high-end hardware, I'm going to assume Solaris is the platform. If so, he'll likely be choosing between UFS and ZFS.
In the case of UFS, logging (meta-data only journaling) is on by default in Solaris 10. Performance impact here should be negligible, even in a high-write scenario.
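For reference, a /etc/vfstab entry along these lines (device names and mount point are just examples) mounts a UFS slice for the LDAP data with logging explicitly turned on; swapping in "nologging" would let you measure the difference yourself:

  #device to mount    device to fsck       mount point    FS type  fsck pass  mount at boot  mount options
  /dev/dsk/c1t0d0s5   /dev/rdsk/c1t0d0s5   /var/openldap  ufs      2          yes            logging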
In the case of ZFS, logging is mandatory and cannot be disabled or circumvented. This is due to the nature of the filesystem -- it employs a copy-on-write strategy for ALL blocks, all of the time. This is effectively the equivalent of full-block journaling, and I'm not sure how much it would impact performance in a high-write scenario; I haven't used ZFS much in the real world yet.
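If ZFS does turn out to be the platform, I would still follow Quanah's advice about keeping the database and its logs apart, e.g. by giving each its own dataset (the pool layout and mount points below are purely illustrative):

  zpool create ldappool mirror c1t1d0 c1t2d0
  zfs create -o mountpoint=/var/openldap/db   ldappool/db
  zfs create -o mountpoint=/var/openldap/logs ldappool/logs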
If we're on Linux, then the choice of filesystems is quite a bit more varied, but most of the journaling filesystems on Linux offer both meta-data only and full-block journaling options, with meta-data only being the default.
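ext3 is a handy illustration: the journaling mode is chosen per mount, with the default "data=ordered" journaling meta-data only and "data=journal" enabling full data journaling. A hypothetical /etc/fstab line for the LDAP data partition might look like:

  /dev/sdb1  /var/openldap  ext3  defaults,noatime,data=ordered  0 2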
Let me know if I'm missing something here -- I'm not intimately familiar with BDB's storage mechanisms. I suppose if it created and deleted a small file per transaction, or something similar, then you MAY see degradation from meta-data only journaling. As a system administrator, though, I'd be inclined to find performance elsewhere... pry the journaling filesystem out of my cold dead fingers ;-)
Cheers,
Joseph Dickson -- Sr. UNIX Admin joseph.dickson@meritain.com | 800.748.0003 ext 2151
--On Thursday, January 24, 2008 12:12 PM -0600 Brad Knowles <b.knowles@its.utexas.edu> wrote:
I do not yet understand a great deal about how our existing OpenLDAP systems are designed, but I am curious to learn what kinds of recommendations you folks would have for a large scale system like this.
This is generally good information to know...
But basically, have you read over the information on understanding your system requirements? I.e., how to properly tune DB_CONFIG and slapd.conf?
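As a rough sketch (the numbers below are purely illustrative and need to be sized against your data set and available RAM), the relevant knobs live in two places:

  # DB_CONFIG (in the BDB database directory)
  set_cachesize    0 536870912 1       # 512 MB BDB cache
  set_lg_regionmax 262144
  set_lg_bsize     2097152
  set_flags        DB_LOG_AUTOREMOVE   # let BDB remove old transaction logs

  # slapd.conf (in the bdb/hdb database section)
  cachesize    100000                  # entries cached by slapd
  idlcachesize 300000
  checkpoint   1024 15                 # checkpoint every 1024 KB or 15 minutes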
In the far, dark, distant past, I know that OpenLDAP did not handle situations well when you had both updates and reads occurring on the same system, so the recommendation at the time was to make all updates on the master server, then replicate that out to the slaves where all the read operations would occur. You could even go so far as to set up slaves on pretty much every single major client machine, for maximum distribution and replication of the data, and maximum scalability of the overall LDAP system.
Updates -> master is always recommended. You can set up multi-master with 2.4, but it will be slower than a single-master scenario. The general best practice for fail-over is to have a primary master that receives writes, and a secondary master that receives the updates via replication and will take over, becoming the new primary, if the primary goes down.
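For such a fail-over pair on 2.4, the mirror-mode configuration on each master looks roughly like the following (server IDs, URLs, DNs, and credentials are placeholders, and some external mechanism still has to ensure writes go to only one node at a time):

  serverID   1                         # use 2 on the other master
  syncrepl   rid=001
             provider=ldap://master2.example.com
             type=refreshAndPersist
             retry="5 5 300 +"
             searchbase="dc=example,dc=com"
             bindmethod=simple
             binddn="cn=replicator,dc=example,dc=com"
             credentials=secret
  mirrormode on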
If you did use a multi-master cluster pair environment that handled all the updates and all the LDAP queries that were generated, what kind of performance do you think you should reasonably be able to get with the latest version of 2.4.whatever on high-end hardware, and what kind of hardware would you consider to be "high-end" for that environment? Is CPU more important, or RAM, or disk space/latency?