Hi all,
My latest test system includes a Kerberos server that uses OpenLDAP via IPC as its back-end database. It usually works, but not always. For example, recently, after failing to get kadmin to add a new principal to the Kerberos database, I found this error in the provider's syslog:
Feb 10 22:37:29 kls1 slapd[1722]: bdb_db_cache: db_open(entryUUID) failed: Too many open files (24)
Feb 10 22:37:29 kls1 slapd[1722]: bdb_index_read: Could not open DB entryUUID
Feb 10 22:37:29 kls1 slapd[1722]: conn=4 op=13 RESULT tag=105 err=80 text=index generation failed
A restart of the Kerberos KDC and admin servers seemed to solve the problem, but obviously that's not ideal. Later on, I had a look at the numbers of open files on the system:
~# lsof -i | grep slapd
slapd 1722 openldap 8u IPv6 4603 TCP *:ldap (LISTEN)
slapd 1722 openldap 9u IPv4 4604 TCP *:ldap (LISTEN)
slapd 1722 openldap 545u IPv4 12823 TCP kls1.example.com:ldap->kls2.example.com:51555 (ESTABLISHED)
slapd 1722 openldap 744u IPv4 8899 TCP kls1.example.com:ldap->kls2.example.com:49100 (ESTABLISHED)
545u and 744u!? A restart of the Kerberos servers didn't make a difference, although restarting slapd brought these values down to 8u and 9u respectively. However, I have no idea what caused these numbers to rise. See my provider/master server's config files below.
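A note on reading that output: "lsof -i" lists only network sockets, and the FD column is the descriptor number. Descriptors are handed out lowest-free-first, so a value in the hundreds means that many lower-numbered descriptors were in use when the connection was accepted. A minimal sketch for counting everything a process holds open, assuming a Linux /proc filesystem and enough privileges to read the target's fd directory (the function name count_fds is made up for illustration):

```shell
#!/bin/sh
# count_fds PID: print the number of descriptors the process currently holds.
# Assumes Linux /proc; reading another user's process usually requires root.
count_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Example: count for the current shell. For slapd, pass its PID instead
# (1722 in the lsof output above).
count_fds "$$"
```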
Does anyone have an idea what might be going on and how I might prevent this situation from occurring again?
Thanks,
Jaap
==/etc/ldap/slapd.conf================
include /etc/ldap/schema/core.schema
include /etc/ldap/schema/cosine.schema
include /etc/ldap/schema/nis.schema
include /etc/ldap/schema/inetorgperson.schema
include /etc/ldap/schema/kerberos.schema
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
modulepath /usr/lib/ldap
moduleload back_hdb
sizelimit 500
tool-threads 1
authz-regexp uid=admin,cn=example.com,cn=gssapi,cn=auth cn=admin,dc=example,dc=com
authz-regexp uid=ldap/([^/.]+).example.com,cn=example.com,cn=gssapi,cn=auth cn=$1,ou=consumers,dc=example,dc=com
authz-regexp uid=([^,]+),cn=example.com,cn=gssapi,cn=auth uid=$1,ou=people,dc=example,dc=com
sasl-realm EXAMPLE.COM
authz-policy to
backend hdb
database hdb
suffix "dc=example,dc=com"
directory "/var/lib/ldap"
dbconfig set_cachesize 0 2097152 0
dbconfig set_lk_max_objects 1500
dbconfig set_lk_max_locks 1500
dbconfig set_lk_max_lockers 1500
index objectClass eq
index uid eq
index krbPrincipalName eq,pres,sub
index entryUUID eq
index entryCSN eq
lastmod on
checkpoint 512 30
access to attrs=userPassword,shadowLastChange
        by dn="cn=admin,dc=example,dc=com" write
        by dn="cn=kls2,ou=consumers,dc=example,dc=com" read
        by anonymous auth
        by self write
        by * none
access to dn.subtree="ou=krb5,dc=example,dc=com"
        by dn="cn=admin,dc=example,dc=com" write
        by dn="cn=adm-srv,ou=krb5,dc=example,dc=com" write
        by dn="cn=kdc-srv,ou=krb5,dc=example,dc=com" read
        by dn="cn=kls2,ou=consumers,dc=example,dc=com" read
        by * none
access to dn.base=""
        by * read
access to *
        by dn="cn=admin,dc=example,dc=com" write
        by * read
moduleload syncprov
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100
======================================
==/etc/default/slapd==================
SLAPD_CONF=
SLAPD_USER="openldap"
SLAPD_GROUP="openldap"
SLAPD_PIDFILE=
SLAPD_SERVICES="ldap:/// ldapi:///"
SLAPD_SENTINEL_FILE=/etc/ldap/noslapd
export KRB5_KTNAME=/etc/krb5.keytab
SLAPD_OPTIONS=""
======================================
==/etc/krb5.conf======================
[libdefaults]
        default_realm = EXAMPLE.COM
        forwardable = true
        proxiable = true
[realms]
        EXAMPLE.COM = {
                kdc = kls1.example.com
                admin_server = kls.example.com
                database_module = openldap_ldapconf
        }
[domain_realm]
        .example.com = EXAMPLE.COM
        example.com = EXAMPLE.COM
[login]
        krb4_convert = true
[dbmodules]
        openldap_ldapconf = {
                db_library = kldap
                ldap_kerberos_container_dn = ou=krb5,dc=example,dc=com
                ldap_kdc_dn = cn=kdc-srv,ou=krb5,dc=example,dc=com
                ldap_kadmind_dn = cn=adm-srv,ou=krb5,dc=example,dc=com
                ldap_service_password_file = /etc/krb5kdc/service.keyfile
                ldap_conns_per_server = 5
        }
[logging]
        kdc = FILE:/var/log/krb5/kdc.log
        admin_server = FILE:/var/log/krb5/kadmin.log
        default = FILE:/var/log/krb5/klib.log
======================================
Note: "ldap_servers" option omitted, as the default is to use IPC.
======================================
Jaap Winius jwinius@umrk.nl writes:
> Hi all,
> My latest test system includes a Kerberos server that uses OpenLDAP via IPC as its back-end database. It usually works, but not always. For example, recently, after failing to get kadmin to add a new principal to the Kerberos database, I found this error in the provider's syslog:
> Feb 10 22:37:29 kls1 slapd[1722]: bdb_db_cache: db_open(entryUUID) failed: Too many open files (24)
> Feb 10 22:37:29 kls1 slapd[1722]: bdb_index_read: Could not open DB entryUUID
> Feb 10 22:37:29 kls1 slapd[1722]: conn=4 op=13 RESULT tag=105 err=80 text=index generation failed
> A restart of the Kerberos KDC and admin servers seemed to solve the problem, but obviously that's not ideal. Later on, I had a look at the numbers of open files on the system:
What is the output of ulimit -Sn and ulimit -Hn? If the two values differ, increase the soft limit (-Sn) up to the hard limit (-Hn).
-Dieter
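As a sketch of that suggestion: an unprivileged process may raise its soft limit up to, but not beyond, its hard limit. Assuming a shell whose ulimit builtin accepts -S and -H (bash does), something like the following would do it for the current shell and whatever it starts afterwards. The function name raise_soft_nofile is only illustrative:

```shell
#!/bin/sh
# Raise the soft open-files limit to the current hard limit.
# Affects only this shell and processes it starts later; an already
# running daemon keeps the limits it was started with.
raise_soft_nofile() {
    ulimit -Sn "$(ulimit -Hn)"
}

raise_soft_nofile
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
```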
Quoting Dieter Kluenter dieter@dkluenter.de:
> What is the output of ulimit -Sn and ulimit -Hn? If the two values differ, increase the soft limit (-Sn) up to the hard limit (-Hn).
~# ulimit -Sn
1024
~# ulimit -Hn
1024
~# _
Would you suggest that e.g. "ulimit -n unlimited" be added to /etc/profile?
Thanks,
Jaap
Jaap Winius jwinius@umrk.nl writes:
> Quoting Dieter Kluenter dieter@dkluenter.de:
>> What is the output of ulimit -Sn and ulimit -Hn? If the two values differ, increase the soft limit (-Sn) up to the hard limit (-Hn).
> ~# ulimit -Sn
> 1024
> ~# ulimit -Hn
> 1024
> ~# _
> Would you suggest that e.g. "ulimit -n unlimited" be added to /etc/profile?
Unfortunately, the hard limit is set to 1024, so you have to increase both the soft limit and the hard limit. You should probably set the soft limit to 2048 and the hard limit to 8192.
-Dieter
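One caveat when checking these values: ulimit in a login shell reports that shell's limits, and a daemon started at boot does not pass through /etc/profile, so slapd's effective limits may differ from what the shell shows. A rough sketch for reading them from the running process, assuming a kernel that provides /proc/PID/limits (the helper name nofile_limits is hypothetical):

```shell
#!/bin/sh
# nofile_limits PID: print "soft hard" open-file limits of a running process.
# Assumes Linux with /proc/PID/limits; other users' processes may need root.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Example: limits of the current shell.
nofile_limits "$$"
```

For slapd, the PID could be taken from the pidfile named in slapd.conf (/var/run/slapd/slapd.pid).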
Quoting Dieter Kluenter dieter@dkluenter.de:
> Unfortunately, the hard limit is set to 1024, so you have to increase both the soft limit and the hard limit. You should probably set the soft limit to 2048 and the hard limit to 8192.
Right. I've created an /etc/initscript, exactly like the sample that's described in "man initscript". That should fix things.
I'm still worried, though: a recent "lsof -i| grep slapd" showed that the number of open file descriptors for the refreshAndPersist replication process was up to 1023. Surely that's not normal. Could this be due to a file descriptor leak?
I'm using Debian lenny with slapd v2.4.11-1.
Cheers,
Jaap
Jaap Winius jwinius@umrk.nl writes:
> Quoting Dieter Kluenter dieter@dkluenter.de:
>> Unfortunately, the hard limit is set to 1024, so you have to increase both the soft limit and the hard limit. You should probably set the soft limit to 2048 and the hard limit to 8192.
> Right. I've created an /etc/initscript, exactly like the sample that's described in "man initscript". That should fix things.
> I'm still worried, though: a recent "lsof -i| grep slapd" showed that the number of open file descriptors for the refreshAndPersist replication process was up to 1023. Surely that's not normal. Could this be due to a file descriptor leak?
> I'm using Debian lenny with slapd v2.4.11-1.
Is there any chaining, back-ldap or back-meta involved? I am not aware of any FD leak, although 2.4.11 is rather old (July 2008). Numerous syncrepl bugs have been fixed since then, so you should probably update to 2.4.21.
-Dieter
Quoting Dieter Kluenter dieter@dkluenter.de:
> Is there any chaining, back-ldap or back-meta involved?
On the consumer, I'm using the chain overlay, which requires the back_ldap module. The large numbers of open file descriptors occur on both the consumer and the provider, although this may be symptomatic.
> I am not aware of any FD leak, although 2.4.11 is rather old (July 2008). Numerous syncrepl bugs have been fixed since then, so you should probably update to 2.4.21.
I'll give it a spin and see if it makes a difference.
Cheers,
Jaap
Hi all,
After more research, I discovered that the actual cause of the problem is indeed a file descriptor leak: not in slapd, but in krb524d -- the Kerberos V to IV ticket conversion service -- which is part of the krb5-kdc package in Debian. It occurs when Kerberos is configured to use LDAP as its back-end database.
I filed a Debian bug report, only to learn that it was a known problem that will likely not be fixed, since krb524d has been removed from current krb5 releases. I was also told that there seems to be a related, much slower leak when using krb5-kdc with LDAP, but thankfully that one is more likely to be fixed.
Cheers,
Jaap
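For anyone chasing a similar problem: to tell a leak from a merely busy server, it can help to sample a process's descriptor count over time and see whether it only ever climbs. A minimal sketch, assuming Linux /proc and sufficient privileges. The name fd_growth and its arguments are illustrative, and the PIDs of slapd, krb5kdc, kadmind and krb524d would be passed in turn:

```shell
#!/bin/sh
# fd_growth PID SAMPLES INTERVAL
# Print "epoch-seconds fd-count" once per sample, INTERVAL seconds apart.
fd_growth() {
    pid=$1
    samples=$2
    interval=$3
    i=0
    while [ "$i" -lt "$samples" ]; do
        printf '%s %s\n' "$(date +%s)" "$(ls "/proc/$pid/fd" | wc -l)"
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example: three samples of the current shell, one second apart.
fd_growth "$$" 3 1
```

A count that keeps rising while the number of client connections stays flat points at the process worth reporting.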
openldap-technical@openldap.org