Hi guys, I've discovered a major memory issue. We have an OpenLDAP server (2.4.11) running with almost 2 million entries. When I do an ldapsearch to retrieve the entire tree, the memory consumption grows and grows and doesn't stop until it has eaten all of the RAM and swap.
My slapd.conf file looks like this:
include /var/ldap/ldap-ds/etc/schema/core.schema
include /var/ldap/ldap-ds/etc/schema/netmobile.schema
include /var/ldap/ldap-ds/etc/schema/netmobile.acc.schema
include /var/ldap/ldap-ds/etc/schema/netmobile.zMRDB.schema
# Define global ACLs to disable default read access.
# Do not enable referrals until AFTER you have a working directory
# service AND an understanding of referrals.
#referral ldap://root.openldap.org
pidfile  /var/ldap/ldap-ds/var/run/slapd.pid
argsfile /var/ldap/ldap-ds/var/run/slapd.args
access to *
        by dn="cn=replica,ou=replica,ou=ldapaccounts,o=netmldap" write
        by users read
        by anonymous read
access to *
        by self write
        by users read
        by anonymous read
#threads 2
# all logging
#loglevel -1
# only syncrepl logging
loglevel 16640
#######################################################################
# BDB database definitions
#######################################################################

database monitor
access to dn.subtree="cn=Monitor"
        by dn.exact="cn=admin,ou=netm,ou=people,o=netmldap" write
        by dn.exact="cn=admin,ou=netm,ou=people,o=netmldap" read
        by * none
database bdb suffix "cn=accesslog" rootdn "cn=Admin,o=netmldap" directory /var/ldap/ldap-ds/cn=accesslog
index default eq
index entryCSN,objectClass,reqEnd,reqResult,reqStart
overlay syncprov
syncprov-nopresent TRUE
syncprov-reloadhint TRUE
limits dn.exact="cn=Admin,o=netmldap"
        time.soft=unlimited time.hard=unlimited
        size.soft=unlimited size.hard=unlimited
database config
rootdn "cn=admin,cn=config"
rootpw config
database bdb suffix "o=netmldap" rootdn "cn=Admin,o=netmldap" # Cleartext passwords, especially for the rootdn, should # be avoid. See slappasswd(8) and slapd.conf(5) for details. # Use of strong authentication encouraged. rootpw {SSHA}5ycg8tSV/i0Z99FKaylr0Az5x1nBA8TC directory /var/ldap/ldap-ds/o=netmldap
#dbcachesize 1000000
cachesize 1000
idlcachesize 3000
cachefree 500
searchstack 8
#readonly on
# Indices to maintain
index default pres,eq
index cn
index netmzMRDBPhoneNumber
index netmAccNumAddr
index netmAccNumPort
index netmLogin2
index netmPortalName
index netmCarrierID
# new indices - 2007-01-15
index netmLogin pres,eq
index netmClientContractUniqueName
index netmPrivateMail
index netmContactTecEmail
index netmContactBillingEmail
index netmFirmEmail
index netmContactCommEmail
index netmzMRDBCarrier
index netmzMRDBblacklistLA
index netmzMRDBPortingDate
index objectClass pres,eq
index entryCSN,entryUUID eq
# Save the time that the entry gets modified
lastmod on
overlay syncprov
syncprov-checkpoint 1000 60
overlay accesslog
logdb cn=accesslog
logops writes
logsuccess TRUE
# scan the accesslog DB every day, and purge entries older than 7 days
logpurge 07+00:00 01+00:00
limits dn.exact="cn=Admin,o=netmldap"
        time.soft=unlimited time.hard=unlimited
        size.soft=unlimited size.hard=unlimited
sizelimit -1
The DB_CONFIG file looks like this:
set_cachesize 0 8435456 1
set_lg_regionmax 262144
set_lg_bsize 2097152
# will automatically remove transaction logs
# this setting isn't recommended
set_flags DB_LOG_AUTOREMOVE
When I start slapd it consumes this much RAM:

total kB  478476
After doing the search it looks like this:

total kB  937316
I've done a pmap -x to see where the memory goes, but the large mapping is just marked as anonymous, so I don't know how to get this memory back. I have to restart the LDAP server to get it released.
I would appreciate it if someone could help me solve this issue.

Thanks
Thorsten
On Mon, Nov 17, 2008 at 02:43:48PM +0100, tkohlhepp wrote:
I've discovered a major memory issue. We have an OpenLDAP server (2.4.11) running with almost 2 million entries. When I do an ldapsearch to retrieve the entire tree, the memory consumption grows and grows and doesn't stop until it has eaten all of the RAM and swap.
The figures you give suggest that the maximum size reached is under 1GB. That does not seem bad for 2M entries.
Retrieving 2M entries in a single operation is going to tax any LDAP server, especially if you do not request paged results (see the example below). Consider what it must do:
1) Make a list of every entry ID
2) Retrieve the data for every entry
3) Build a message containing 2M entries
4) Send the message
Steps 2 and 3 are going to consume a *lot* of memory. Just looking at the number of attributes you are indexing, I suspect that your entries are over 1kb each. That gives a result message of 2M * 1kb = 2GB (at least). All of this must fit into memory before it can be sent.
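As an aside, here is how a client could request the simple paged results control (RFC 2696) from the OpenLDAP command line; the page size of 1000 is only an illustration, and the bind DN is taken from your config:

    ldapsearch -x -H ldap://localhost \
        -D "cn=Admin,o=netmldap" -W \
        -b "o=netmldap" \
        -E pr=1000/noprompt \
        "(objectClass=*)"

Paging does not change the candidate list the server has to build, but it spares the client (and the server's send queue) from handling the whole result at once.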
#dbcachesize 1000000
cachesize 1000
idlcachesize 3000
cachefree 500
searchstack 8
The DB_CONFIG file looks like this:
set_cachesize 0 8435456 1
set_lg_regionmax 262144
set_lg_bsize 2097152
All of those numbers seem very small for a directory with 2M entries. It depends how you use it of course, but 'fetch every entry' is going to kill it.
When I start slapd it consumes this much RAM:

total kB  478476
After doing the search it looks like this:

total kB  937316
If that represents 'all of memory and all of swap' then you need to add more of both!
If the normal service runs OK with these numbers but it is just the 'retrieve all' operation that gives trouble, then why not find a different way to do that one job? slapcat perhaps?
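For example, something along these lines (the paths are assumptions based on the slapd.conf above):

    slapcat -f /var/ldap/ldap-ds/etc/slapd.conf -b "o=netmldap" -l netmldap.ldif

slapcat reads the database files directly and streams the entries to an LDIF file, so the whole result never has to pass through the server's caches.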
Andrew
Andrew Findlay writes:
Retrieving 2M entries in a single operation is going to tax any LDAP server, especially if you do not request paged results. Consider what it must do:
- Make a list of every entry ID
- Retrieve the data for every entry
- Build a message containing 2M entries
- Send the message
No, each entry is sent in a separate message.
However, OpenLDAP does build a list of all the entry IDs to examine and possibly return, subject to indexes for the filter. And it must read-lock all these entries so that an update operation won't mess things up while it is sending, and so that updates are atomic as seen by the search request.
I don't know what BDB does when there are 2M entries to examine though. Maybe it just gives up and examines all entries, as LDBM did.
Hi,
Hallvard B Furuseth wrote:
Andrew Findlay writes:
Retrieving 2M entries in a single operation is going to tax any LDAP server, especially if you do not request paged results. Consider what it must do:
- Make a list of every entry ID
- Retrieve the data for every entry
- Build a message containing 2M entries
- Send the message
No, each entry is sent in a separate message.
I also thought it would send each entry separately, because building a single message with 2M entries wouldn't make sense. It would also take much longer to respond. The first entry of the search is returned immediately, which indicates that each entry is sent separately.
However, OpenLDAP does build a list of all the entry IDs to examine and possibly return, subject to indexes for the filter. And it must read-lock all these entries so that an update operation won't mess things up while it is sending, and so that updates are atomic as seen by the search request.
I don't know what BDB does when there are 2M entries to examine though. Maybe it just gives up and examines all entries, as LDBM did.
The total memory of the server is 4 GB and swap is 2 GB, so it will survive even if we pull the entire tree using ldapsearch. But we would like to put other services on the same server as well, and things could slow down if LDAP is already using a lot of memory.
I know that doing an ldapsearch with "(objectClass=*)" is a bad way to get all entries, but I want to make sure that a badly formatted search can't slow down the entire server by consuming a lot of memory.
Another question: why isn't it releasing the used memory after the search has finished?
Thanks
Thorsten
Thorsten Kohlhepp wrote:
[...]
I also thought it would send each entry separately, because building a single message with 2M entries wouldn't make sense. It would also take much longer to respond. The first entry of the search is returned immediately, which indicates that each entry is sent separately.
There's no need to experiment. This is clearly indicated in the protocol specification (RFC 4511, but it has always been like this).
[...]
I know that doing an ldapsearch with "(objectClass=*)" is a bad way to get all entries,
Too bad there's no other way. If you find any, please let us know.
but I want to make sure that a badly formatted search can't slow down the entire server by consuming a lot of memory.
If you want to inhibit expensive searches, take a look at the "limits" statement in slapd.conf(5). In particular, consider size.unchecked.
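A sketch (the numbers are arbitrary; pick them for your workload): reject searches from ordinary users whose index-derived candidate set cannot be narrowed below 50000 entries, and cap what they can retrieve:

    limits users
            size.soft=500 size.hard=500
            size.unchecked=50000
            time.hard=60

With size.unchecked, a search whose candidate list exceeds the threshold is refused outright instead of being executed.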
Another question: why isn't it releasing the used memory after the search has finished?
Depending on the backend and on the database, caching may take place (and should, if you want performance). For details about Berkeley DB caching, see Sleepycat's documentation. For details about back-bdb and back-hdb caching, see cachesize, idlcachesize, dncachesize in slapd-bdb(5), and http://www.openldap.org/doc/admin24/tuning.html.
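As a rough sketch only (the values are illustrative, not a recommendation for your data set), a 2M-entry directory would typically use something much larger than the values in your config:

    # slapd.conf, back-bdb section
    cachesize    20000     # entries kept in the entry cache
    idlcachesize 60000     # IDL cache; a common rule of thumb is ~3x cachesize
    # dncachesize is relevant to back-hdb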
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Fax:     +39 0382 476497
Email:   ando@sys-net.it
-----------------------------------
Pierangelo Masarati wrote:
Thorsten Kohlhepp wrote:
Hi,
Hallvard B Furuseth wrote:
Andrew Findlay writes:
Retrieving 2M entries in a single operation is going to tax any LDAP server, especially if you do not request paged results. Consider what it must do:
- Make a list of every entry ID
- Retrieve the data for every entry
- Build a message containing 2M entries
- Send the message
No, each entry is sent in a separate message.
I also thought it would send each message separately, because to build a message with 2M entries wouldn't make sense. It would also take much longer to respond. The first entry of the search is returned immediately which indicates that each entry is sent separately.
There's no need to experiment. This is clearly indicated in the protocol specification (RFC4511, but it has always been like this)
However OpenLDAP does build a list of all entry IDs to examine and possibly, subject to indexes for the filters. And it must readlock all these entries so that an update operation won't mess things up while it is sending, and so updates will be atomic as seen by the search request.
I don't know what BDB does when there are 2M entries to examine though. Maybe it just gives up and examines all entries, as LDBM did.
The total memory of the server is 4 GB and swap 2 GB. So it will survive even if we pull the entire tree by using ldapsearch. But we would like to put other services as well on the same server which could slow things down if LDAP is already using a lot of memory.
I know doing an ldapsearch "(objectClass=*)" is a bad way to get all entries,
Too bad there's no other way. If you find any, please let us know.
but I want to make sure that a bad formatted search can't slow down the entire server by consuming a lot of memory.
If you want to inhibit expensive searches, tke a look at the "limits" statement of slapd.conf(5). In detail, consider limits size.unchecked.
Another question: why isn't it releasing the used memory after the search has finished?
Depending on the backend and on the database, caching may take place (and should, if you want performance). For details about Berkeley DB caching, see Sleepycat's documentation. For details about back-bdb and back-hdb caching, see cachesize, idlcachesize, dncachesize in slapd-bdb(5), and http://www.openldap.org/doc/admin24/tuning.html.
Of course it will cache the entries, but I defined a cache size of 8.4m, an entry cachesize of 1000 and an idlcachesize of 1000. When the search finishes it consumes 937316 kB. That is way over the cache sizes. What am I doing wrong?
Thanks
Thorsten
Thorsten Kohlhepp wrote:
[...]
Depending on the backend and on the database, caching may take place (and should, if you want performance). For details about Berkeley DB caching, see Sleepycat's documentation. For details about back-bdb and back-hdb caching, see cachesize, idlcachesize, dncachesize in slapd-bdb(5), and http://www.openldap.org/doc/admin24/tuning.html.
Of course it will cache the entries, but I defined a cache size of 8.4m, an entry cachesize of 1000 and an idlcachesize of 1000. When the search finishes it consumes 937316 kB. That is way over the cache sizes. What am I doing wrong?
What is 8.4m? 8.4 minutes? The Berkeley DB cache size is expressed by two numbers, the first counting gigabytes and the second counting megabytes. If by "8.4m" you mean 8.4 MB, then your cache is very likely way underestimated.
An entry cachesize of 1000 means 1000 entries. So it may well mean lots of kB (or MB) depending on the actual size of your entries (the size of an entry is usually more than twice that of its textual representation, since all values are stored in pretty and normalized form, plus overhead).
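As a back-of-the-envelope example (the per-entry figure is only an assumption, borrowed from Andrew's ~1 KB estimate earlier in the thread): 1000 cached entries x ~2 KB in memory each is only about 2 MB for the entry cache itself, before BDB's own cache and per-operation allocations are counted.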
In any case, if you fear leaks, please do run slapd under valgrind and report any issue. It will help make slapd better.
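Something along these lines (the paths are assumptions; -d 1 keeps slapd in the foreground so valgrind can follow it to a clean shutdown):

    valgrind --leak-check=full --num-callers=20 \
        /usr/local/libexec/slapd -f /var/ldap/ldap-ds/etc/slapd.conf -d 1

Run the problematic search, stop slapd cleanly (e.g. with SIGINT), and valgrind will print a leak summary.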
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Fax:     +39 0382 476497
Email:   ando@sys-net.it
-----------------------------------
Pierangelo Masarati wrote:
[...]
Of course it will cache the entries, but I defined a cache size of 8.4m, an entry cachesize of 1000 and an idlcachesize of 1000. When the search finishes it consumes 937316 kB. That is way over the cache sizes. What am I doing wrong?
What is 8.4m? 8.4 minutes? The Berkeley DB cache size is expressed by two numbers, the first counting gigabytes and the second counting megabytes. If by "8.4m" you mean 8.4 MB, then your cache is very likely way underestimated.
I meant 8.4 MB. Actually the function DB->set_cachesize takes 3 numbers:
http://www.oracle.com/technology/documentation/berkeley-db/db/api_c/db_set_c...

In my DB_CONFIG I've set

set_cachesize 0 8435456 1

which means a single cache region of 8.4 MB. This is working fine, because db_stat -m shows:

1          Number of caches
1          Maximum number of caches
10MB 64KB  Pool individual cache size
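For comparison, a cache sized in gigabytes would look like this in DB_CONFIG (the 1 GB figure is just an illustration; the right size depends on how much of the 2M-entry database should stay resident):

    # set_cachesize <gbytes> <bytes> <ncache>
    set_cachesize 1 0 1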
An entry cachesize of 1000 means 1000 entries. So it may well mean lots of kB (or MB) depending on the actual size of your entries (the size of an entry is usually more than twice that of its textual representation, since all values are stored in pretty and normalized form, plus overhead).
In any case, if you fear leaks, please do run slapd under valgrind and report any issue. It will help make slapd better.
And now comes the strange thing. With slapd not running:

free -k
             total       used       free     shared    buffers     cached
Mem:       4194304    2768584    1425720          0     134096    2441308
-/+ buffers/cache:      193180    4001124
Swap:      1052248         64    1052184
With slapd running:

free -k
             total       used       free     shared    buffers     cached
Mem:       4194304    2772980    1421324          0     134124    2441320
-/+ buffers/cache:      197536    3996768
Swap:      1052248         64    1052184
So it took only 4 MB, which is fine because the cache isn't used yet. After running ldapsearch ... "(objectClass=*)" with no sizelimit:

free -k
             total       used       free     shared    buffers     cached
Mem:       4194304    3242360     951944          0     134368    2451296
-/+ buffers/cache:      656696    3537608
Swap:      1052248         64    1052184
As you can see, it took more than 400 MB, and this memory isn't released unless I restart the LDAP server. I don't know where this memory went; if I do a pmap I get this line, which kept growing during the search:

000000001f511000  464100       -       -       - rw---    [ anon ]
Thanks for all of your help.
Ciao
Thorsten