Sam Tran skrev, on 02-01-2008 23:14:
Dear All,
We are running OL 2.3.39 on Centos 5 i386 or x86_64. We have one provider and three consumers (LDAP-sync repl).
Several applications perform LDAP write and read operations on the provider.
For the second time in 2 months, we had what it looked like a file descriptor leak on the provider: file descriptors were not closed at all or fast enough. At the same time, slapd was unresponsive. Here is what the logs shows:
[snip] Dec 31 10:00:02 info-ldap-003 slapd[29329]: conn=87886 fd=75 ACCEPT from IP=127.0.0.1:47391 (IP=0.0.0.0:389) Dec 31 10:00:02 info-ldap-003 slapd[29329]: conn=87887 fd=76 ACCEPT from IP=172.16.8.39:55065 (IP=0.0.0.0:636) Dec 31 10:00:02 info-ldap-003 slapd[29329]: conn=87887 fd=76 TLS established tls_ssf=256 ssf=256 Dec 31 10:00:36 info-ldap-003 slapd[29329]: conn=87888 fd=77 ACCEPT from IP=172.16.77.245:1285 (IP=0.0.0.0:389) Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87889 fd=78 ACCEPT from IP=172.16.8.39:55073 (IP=0.0.0.0:636) Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87890 fd=79 ACCEPT from IP=172.16.8.39:55074 (IP=0.0.0.0:636) Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87891 fd=80 ACCEPT from IP=172.16.8.39:55075 (IP=0.0.0.0:636) Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87889 fd=78 TLS established tls_ssf=256 ssf=256 Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87890 fd=79 TLS established tls_ssf=256 ssf=256 Dec 31 10:01:01 info-ldap-003 slapd[29329]: conn=87891 fd=80 TLS established tls_ssf=256 ssf=256 Dec 31 10:01:28 info-ldap-003 slapd[29329]: conn=87892 fd=81 ACCEPT from IP=10.0.0.23:47414 (IP=0.0.0.0:389) Dec 31 10:01:28 info-ldap-003 slapd[29329]: conn=87892 fd=81 closed (connection lost) Dec 31 10:01:28 info-ldap-003 slapd[29329]: conn=87893 fd=81 ACCEPT from IP=10.0.0.23:47415 (IP=0.0.0.0:389) Dec 31 10:01:40 info-ldap-003 slapd[29329]: conn=87894 fd=82 ACCEPT from IP=172.16.8.31:47692 (IP=0.0.0.0:636) Dec 31 10:01:40 info-ldap-003 slapd[29329]: conn=87894 fd=82 closed (TLS negotiation failure) Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87895 fd=82 ACCEPT from IP=10.0.0.23:47741 (IP=0.0.0.0:636) Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87895 fd=82 closed (TLS negotiation failure) Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87896 fd=82 ACCEPT from IP=172.16.8.31:47757 (IP=0.0.0.0:636) Dec 31 10:01:44 info-ldap-003 slapd[29329]: conn=87896 fd=82 closed (TLS negotiation failure) Dec 31 10:01:46 info-ldap-003 slapd[29329]: conn=87897 fd=82 ACCEPT from IP=172.16.8.31:47760 (IP=0.0.0.0:389) Dec 31 10:01:46 info-ldap-003 slapd[29329]: conn=87897 fd=82 closed (connection lost) Dec 31 10:01:46 info-ldap-003 slapd[29329]: conn=87898 fd=82 ACCEPT from IP=172.16.8.31:47761 (IP=0.0.0.0:389) Dec 31 10:01:49 info-ldap-003 slapd[29329]: conn=87899 fd=83 ACCEPT from IP=172.16.8.31:47774 (IP=0.0.0.0:389) Dec 31 10:01:54 info-ldap-003 slapd[29329]: conn=87900 fd=84 ACCEPT from IP=172.16.8.31:47785 (IP=0.0.0.0:389) Dec 31 10:01:54 info-ldap-003 slapd[29329]: conn=87900 fd=84 closed (connection lost) Dec 31 10:01:54 info-ldap-003 slapd[29329]: conn=87901 fd=84 ACCEPT from IP=172.16.8.31:47786 (IP=0.0.0.0:389) Dec 31 10:01:57 info-ldap-003 slapd[29329]: conn=87902 fd=85 ACCEPT from IP=172.16.8.31:47788 (IP=0.0.0.0:389) Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87903 fd=86 ACCEPT from IP=172.16.8.39:55083 (IP=0.0.0.0:636) Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87903 fd=86 TLS established tls_ssf=256 ssf=256 Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87904 fd=87 ACCEPT from IP=172.16.8.39:55086 (IP=0.0.0.0:636) Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87904 fd=87 TLS established tls_ssf=256 ssf=256 Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87905 fd=88 ACCEPT from IP=172.16.8.39:55088 (IP=0.0.0.0:636) Dec 31 10:02:02 info-ldap-003 slapd[29329]: conn=87905 fd=88 TLS established tls_ssf=256 ssf=256 Dec 31 10:02:23 info-ldap-003 slapd[29329]: conn=87906 fd=89 ACCEPT from IP=172.16.8.31:47816 (IP=0.0.0.0:389) Dec 31 10:02:23 info-ldap-003 slapd[29329]: conn=87906 fd=89 closed (connection lost) Dec 31 10:02:23 info-ldap-003 slapd[29329]: conn=87907 fd=89 ACCEPT from IP=172.16.8.31:47817 (IP=0.0.0.0:389) Dec 31 10:02:26 info-ldap-003 slapd[29329]: conn=87908 fd=90 ACCEPT from IP=172.16.8.31:47819 (IP=0.0.0.0:389) Dec 31 10:02:31 info-ldap-003 slapd[29329]: conn=87909 fd=91 ACCEPT from IP=172.16.8.31:47822 (IP=0.0.0.0:389) Dec 31 10:02:31 info-ldap-003 slapd[29329]: conn=87909 fd=91 closed (connection lost) Dec 31 10:02:31 info-ldap-003 slapd[29329]: conn=87910 fd=91 ACCEPT from IP=172.16.8.31:47823 (IP=0.0.0.0:389) Dec 31 10:02:34 info-ldap-003 slapd[29329]: conn=87911 fd=92 ACCEPT from IP=172.16.8.31:47825 (IP=0.0.0.0:389) Dec 31 10:02:36 info-ldap-003 slapd[29329]: conn=87912 fd=93 ACCEPT from IP=172.16.8.38:4592 (IP=0.0.0.0:636) Dec 31 10:02:36 info-ldap-003 slapd[29329]: conn=87912 fd=93 TLS established tls_ssf=128 ssf=128 Dec 31 10:02:59 info-ldap-003 slapd[29329]: conn=87913 fd=94 ACCEPT from IP=172.16.8.31:47834 (IP=0.0.0.0:389) Dec 31 10:02:59 info-ldap-003 slapd[29329]: conn=87913 fd=94 closed (connection lost) Dec 31 10:02:59 info-ldap-003 slapd[29329]: conn=87914 fd=94 ACCEPT from IP=172.16.8.31:47835 (IP=0.0.0.0:389) Dec 31 10:03:01 info-ldap-003 slapd[29329]: conn=87915 fd=95 ACCEPT from IP=172.16.8.39:39120 (IP=0.0.0.0:636) Dec 31 10:03:01 info-ldap-003 slapd[29329]: conn=87916 fd=96 ACCEPT from IP=172.16.8.39:39121 (IP=0.0.0.0:636) Dec 31 10:03:01 info-ldap-003 slapd[29329]: conn=87915 fd=95 TLS established tls_ssf=256 ssf=256 [snip]
I restarted slapd, which fixed the problem.
The first time that problem occurred, slapd ran out of file descriptor. I don't know what triggered the problem. Prior to the problem there was no increase in load, all LDAP operations were performed successfully.
I would appreciate it if anyone could give me some pointers on how to troubleshoot the problem.
We had this on heavily loaded OL 2.3 servers and solved it by setting a low(ish) idletimeout value (see 'man slapd.conf').
Best,
--Tonni