After some users reported that slapd had changed size enormously between two Zimbra releases, I spent some time tracking down why. Apparently, at least on Linux, the startup size of slapd is directly related to the number of file descriptors it can access.
For example, with 1024 file descriptors, slapd was 18MB resident and around 100MB virtual in size. With some 550,000 file descriptors, slapd was 500MB+ virtual and 327MB resident. This seems a bit odd. Is this simply a "feature" of epoll() (I'm on a Linux 2.6 kernel)?
--Quanah
--
Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
> After some users reported that slapd had changed size enormously between two Zimbra releases, I spent some time tracking down why. Apparently, at least on Linux, the startup size of slapd is directly related to the number of file descriptors it can access.
> For example, with 1024 file descriptors, slapd was 18MB resident and around 100MB virtual in size. With some 550,000 file descriptors, slapd was 500MB+ virtual and 327MB resident. This seems a bit odd. Is this simply a "feature" of epoll() (I'm on a Linux 2.6 kernel)?
No, this is not OS dependent at all. slapd allocates its own Connection array based on the number of available descriptors. There's nothing unusual going on here, though 500K+ descriptors seems a bit excessive. Unless you have a server listening on multiple network interfaces, the most connections you're likely to get is 32768 or shy of 65536, depending on OS. You should really think about what you're trying to accomplish and what the realistic constraints actually are.
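As a rough illustration of the point about the Connection array, the figures quoted in this thread can be turned into a per-descriptor cost. This is a back-of-the-envelope sketch in Python, not a measurement of slapd internals: it assumes the growth in resident size is linear in the descriptor limit, and the function name is made up for the example. It also reads the current process's own descriptor limit through the standard `resource` module (Unix only).

```python
import resource


def bytes_per_descriptor(small_fds, small_mb, large_fds, large_mb):
    """Estimate memory cost per descriptor slot from two (limit, size) samples.

    Assumes memory grows linearly with the descriptor limit. That is an
    assumption made for this sketch, not something verified against slapd.
    """
    extra_bytes = (large_mb - small_mb) * 1024 * 1024
    extra_fds = large_fds - small_fds
    return extra_bytes / extra_fds


# Resident sizes reported in the thread: 18MB at 1024 fds, 327MB at ~550,000 fds.
per_fd = bytes_per_descriptor(1024, 18, 550_000, 327)
print(f"approx. {per_fd:.0f} bytes resident per descriptor slot")

# The limits a daemon started from this process would inherit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"current soft limit: {soft}, hard limit: {hard}")
```

With the thread's numbers this works out to roughly 590 bytes of resident memory per descriptor slot, which is consistent with a fixed-size per-connection structure being allocated up front.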
--On Wednesday, February 27, 2008 8:51 PM -0800 Howard Chu hyc@symas.com wrote:
> No, this is not OS dependent at all. slapd allocates its own Connection array based on the number of available descriptors. There's nothing unusual going on here, though 500K+ descriptors seems a bit excessive. Unless you have a server listening on multiple network interfaces, the most connections you're likely to get is 32768 or shy of 65536, depending on OS. You should really think about what you're trying to accomplish and what the realistic constraints actually are.

On deployments with millions of users (which we have), it is not unreasonable that slapd/imap/pop/mysql etc. together need a high number of file descriptors for the zimbra user. However, I think it may be reasonable to break slapd out into its own user, so it can use a reduced set of file descriptors.
--Quanah
Quanah Gibson-Mount quanah@zimbra.com writes:
> On deployments with millions of users (which we have), it is not unreasonable that slapd/imap/pop/mysql etc. together need a high number of file descriptors for the zimbra user. However, I think it may be reasonable to break slapd out into its own user, so it can use a reduced set of file descriptors.
You could also limit file descriptors specifically for slapd by doing so in the slapd startup script, without running it as a different user.
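A sketch of that idea, written in Python rather than shell so it can be run and checked: the limit is lowered only in the child, just before exec, so the parent's own limit stays untouched. A shell startup script would typically achieve the same with `ulimit -n` before launching slapd. The value 4096 and the helper name `run_with_fd_limit` are placeholders for this example, and the child here is a Python one-liner standing in for slapd.

```python
import resource
import subprocess
import sys


def run_with_fd_limit(argv, max_fds):
    """Run argv with its soft and hard RLIMIT_NOFILE lowered to max_fds."""

    def lower_limit():
        # Runs in the child after fork() and before exec(); Unix only.
        _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        # An unprivileged process cannot raise its hard limit, so cap at it.
        new = max_fds if hard == resource.RLIM_INFINITY else min(max_fds, hard)
        resource.setrlimit(resource.RLIMIT_NOFILE, (new, new))

    return subprocess.run(
        argv, preexec_fn=lower_limit, capture_output=True, text=True, check=True
    )


# Stand-in for the daemon: a child that reports the soft limit it was given.
probe = [
    sys.executable,
    "-c",
    "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])",
]
result = run_with_fd_limit(probe, 4096)
print("child soft limit:", result.stdout.strip())
```

Because the limit is per-process and inherited, the other services running as the same user keep whatever limit they were started with.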
Quanah Gibson-Mount wrote:
> --On Wednesday, February 27, 2008 8:51 PM -0800 Howard Chu hyc@symas.com wrote:
> > No, this is not OS dependent at all. slapd allocates its own Connection array based on the number of available descriptors. There's nothing unusual going on here, though 500K+ descriptors seems a bit excessive. Unless you have a server listening on multiple network interfaces, the most connections you're likely to get is 32768 or shy of 65536, depending on OS. You should really think about what you're trying to accomplish and what the realistic constraints actually are.
>
> On deployments with millions of users (which we have), it is not unreasonable that slapd/imap/pop/mysql etc. together need a high number of file descriptors for the zimbra user. However, I think it may be reasonable to break slapd out into its own user, so it can use a reduced set of file descriptors.
There are only 65535 possible TCP/IP port numbers. On any given network interface, a number of those ports will be in use by other services, some are just reserved, and a number of them will be in use by outbound connections. Again, unless your machine has more than 8 active network interfaces, it's impossible to even have that many incoming connections. No matter how many millions of users you have.
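The "more than 8 interfaces" figure follows from simple division, sketched below in Python. The 524,288 descriptor limit is the figure mentioned later in this thread. The calculation takes the post's premise of at most 65535 usable ports per interface at face value and only illustrates that reasoning; the function name is invented for the example.

```python
import math

PORTS_PER_INTERFACE = 65535  # upper bound used in the post
DESCRIPTOR_LIMIT = 524_288   # per-process limit discussed in this thread


def interfaces_needed(descriptors, ports_per_interface=PORTS_PER_INTERFACE):
    """Smallest number of interfaces whose combined ports cover `descriptors`."""
    return math.ceil(descriptors / ports_per_interface)


print(interfaces_needed(DESCRIPTOR_LIMIT))
```

Under that premise, eight interfaces fall just short of 524,288, so a ninth would be needed before every descriptor could correspond to a connection.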
This configuration is totally illogical. Even Google was only running their OpenLDAP installations with 3072 connections, and I doubt you're getting more load than gmail...
--On Wednesday, February 27, 2008 9:41 PM -0800 Howard Chu hyc@symas.com wrote:
> This configuration is totally illogical. Even Google was only running their OpenLDAP installations with 3072 connections, and I doubt you're getting more load than gmail...
Why are you assuming that connections are the only things that eat up file descriptors? And as I noted, OpenLDAP is not the only thing running on the server.
--Quanah
Quanah Gibson-Mount wrote:
> --On Wednesday, February 27, 2008 9:41 PM -0800 Howard Chu hyc@symas.com wrote:
> > This configuration is totally illogical. Even Google was only running their OpenLDAP installations with 3072 connections, and I doubt you're getting more load than gmail...
>
> Why are you assuming that connections are the only things that eat up file descriptors? And as I noted, OpenLDAP is not the only thing running on the server.
We're not talking about a server-wide setting here, we're talking about a per-process setting. You really believe any of those apps you talked about will ever have 524,288 files open at once? And then multiply that by multiple processes? Do you have any idea how much kernel memory it requires just to have that many file buffers?
On Thursday 28 February 2008 07:26:26 Quanah Gibson-Mount wrote:
> --On Wednesday, February 27, 2008 8:51 PM -0800 Howard Chu hyc@symas.com wrote:
> > No, this is not OS dependent at all. slapd allocates its own Connection array based on the number of available descriptors. There's nothing unusual going on here, though 500K+ descriptors seems a bit excessive. Unless you have a server listening on multiple network interfaces, the most connections you're likely to get is 32768 or shy of 65536, depending on OS. You should really think about what you're trying to accomplish and what the realistic constraints actually are.
>
> On deployments with millions of users (which we have), it is not unreasonable that slapd/imap/pop/mysql etc. together need a high number of file descriptors for the zimbra user. However, I think it may be reasonable to break slapd out into its own user, so it can use a reduced set of file descriptors.
Well, the question is whether it is a good design to have *all* of those services running as the same user.
As a site currently running qmail-ldap+courier imap+mysql (for webmail/spam preferences), where smtpd runs as one user, pop3d as another, and courier imap as its own user too (and of course, mysql running as mysql, OpenLDAP running as ldap), this whole "let's run everything as the zimbra user" is concerning (considering we are just starting a project to migrate to Zimbra, that may end up being more than 1 million users if the first half-million goes ok).
For instance, I don't like the fact that the IMAP server process has write access to the LDAP database directory/files, or the fact that an apache vulnerability could result in an attacker having write access to the entire mailstore. Our current setup (architecture, as well as software configuration) has none of these security risks.
Regards, Buchan