Hi Quanah,
I moved to OpenLDAP 2.4.18 and patched BDB 4.7.25 with all 4 patches from
Oracle.
I didn't change slapd.conf at all.
I reduced the number of entries to a total of 3437278.
[root@l01lnp2 ~]# du -c -h /var/lib/ldap/*.bdb
200K /var/lib/ldap/bestMatchPrefix.bdb
982M /var/lib/ldap/dn2id.bdb
2.4G /var/lib/ldap/id2entry.bdb
1.8M /var/lib/ldap/objectClass.bdb
1.2M /var/lib/ldap/originatorPrefixID.bdb
48M /var/lib/ldap/uniqueID.bdb
3.4G total <= interesting ... almost the same as the number of entries :)
I changed DB_CONFIG to use a 7 GB cache:
set_cachesize 7 0 1
set_lg_regionmax 262144
set_lg_bsize 2097152
My system has 10 GB of RAM, and the situation now is:
[root@l01lnp2 ~]# free
             total       used       free     shared    buffers     cached
Mem:      10234924   10176544      58380          0       2144    3786596
-/+ buffers/cache:    6387804    3847120
Swap:      4096564     753572    3342992
[root@l01lnp2 ~]#
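As a rough sanity check on the numbers above, here is a minimal Python sketch (not part of my actual setup) that converts the set_cachesize arguments into bytes and compares the result with the total reported by free. The helper names are made up for illustration; the only inputs are figures quoted in this mail.

```python
# Hypothetical helper names; the figures are copied from this mail.
GIB = 1024 ** 3

def bdb_cache_bytes(gbytes: int, extra_bytes: int) -> int:
    """Total size requested by 'set_cachesize <gbytes> <bytes> <ncache>'.

    The third argument (ncache) only says how many regions the cache is
    split into, so it does not change the total and is left out here.
    """
    return gbytes * GIB + extra_bytes

def share_of_ram(cache_bytes: int, mem_total_kib: int) -> float:
    """Fraction of physical RAM (free reports KiB) taken by the cache."""
    return cache_bytes / (mem_total_kib * 1024)

cache = bdb_cache_bytes(7, 0)      # set_cachesize 7 0 1
mem_total_kib = 10234924           # 'total' column of the Mem: row

print(cache, "bytes of BDB cache")
print(f"{share_of_ram(cache, mem_total_kib):.1%} of RAM")
```

By this estimate the BDB cache alone asks for a bit under three quarters of physical memory, before slapd's own entry, DN and IDL caches are counted.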
When I run ldapsearch (time ldapsearch -h localhost -x -b
ou=bestMatchPrefixList,ou=sipDirektor,dc=ot,dc=hr -D cn=admin,dc=ot,dc=hr
-w pero99) before I actually add anything with ldapadd, the search completes
within 40 seconds. The slapd process takes 24 - 26% of memory.
After I add new entries (just 2 more) and perform the same search, it hangs
after a while. When ldapsearch finishes returning entries, I see the slapd
process memory start growing ... it takes almost everything,
reaching 97%.
It is always like this: the search returns all entries and then waits for
some time, almost at random anywhere from 60 seconds to 6 minutes, before it
actually exits.
Could you please take a look at the strace logs I attached to my previous
e-mail? As soon as ldapsearch stops returning entries, I see a lot of
gibberish there.
Here is the slapd process memory growth (see the %MEM column):
top - 16:42:22 up 4 days, 1:02, 2 users, load average: 2.13, 0.67, 0.23
Tasks: 119 total, 1 running, 118 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.8%us, 0.2%sy, 0.0%ni, 70.0%id, 28.8%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 10234924k total, 10177568k used, 57356k free, 6676k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3603688k cached
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9404 ldap 25   0 13.3g 8.8g 2.8g S  4.0 89.7 1:13.49 slapd
    1 root 15   0 10344  372  344 S  0.0  0.0 0:01.69 init
    2 root RT  -5     0    0    0 S  0.0  0.0 0:00.06 migration/0
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.2%us, 0.7%sy, 0.0%ni, 67.5%id, 24.3%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 10234924k total, 10177968k used, 56956k free, 6656k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3580356k cached
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9404 ldap 25   0 13.3g 8.9g 2.9g S 30.3 90.9 1:16.76 slapd
  325 root 10  -5     0    0    0 S  0.7  0.0 5:37.11 kswapd0
 8458 root 15   0     0    0    0 D  0.3  0.0 0:02.02 pdflush
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.0%us, 0.3%sy, 0.0%ni, 72.3%id, 26.1%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 10234924k total, 10180560k used, 54364k free, 6140k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3488164k cached
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9404 ldap 25   0 13.4g 9.3g 3.2g S  4.7 95.5 1:28.86 slapd
 8458 root 15   0     0    0    0 D  0.7  0.0 0:02.20 pdflush
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.9%us, 0.4%sy, 0.0%ni, 70.5%id, 28.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 10234924k total, 10177812k used, 57112k free, 3492k buffers
Swap: 4096564k total, 36516k used, 4060048k free, 3481476k cached
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9404 ldap 25   0 13.4g 9.4g 3.2g S  4.3 95.9 1:30.39 slapd
  325 root 10  -5     0    0    0 S  0.7  0.0 5:38.08 kswapd0
top - 16:45:01 up 4 days, 1:05, 2 users, load average: 1.91, 1.40, 0.59
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.2%us, 0.2%sy, 0.0%ni, 75.0%id, 21.4%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 10234924k total, 10179744k used, 55180k free, 396k buffers
Swap: 4096564k total, 42328k used, 4054236k free, 3473624k cached
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9404 ldap 25   0 13.5g 9.4g 3.3g S 13.6 96.7 1:33.44 slapd
 9490 root 15   0     0    0    0 S  0.3  0.0 0:00.31 pdflush
top - 16:45:33 up 4 days, 1:05, 2 users, load average: 1.55, 1.36, 0.60
Tasks: 117 total, 1 running, 116 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.7%us, 0.2%sy, 0.0%ni, 74.7%id, 22.3%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 10234924k total, 10180100k used, 54824k free, 652k buffers
Swap: 4096564k total, 118616k used, 3977948k free, 3521232k cached
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9404 ldap 25   0 13.5g 9.4g 3.3g S 10.6 96.6 1:37.36 slapd
  325 root 10  -5     0    0    0 S  0.3  0.0 5:38.63 kswapd0
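To put a number on the growth, the following Python sketch works only from the slapd rows pasted above. The variable and function names are illustrative, and because top rounds RES and SHR to 0.1g, the result is a coarse estimate rather than a measurement.

```python
# Figures copied from the six top samples above (slapd, PID 9404).
# All names here are illustrative, not taken from any tool.
MEM_TOTAL_KIB = 10234924          # "Mem: ... total" line of top

# (RES, SHR, %MEM) per sample, in the order pasted; RES/SHR as printed (g).
samples = [
    (8.8, 2.8, 89.7),
    (8.9, 2.9, 90.9),
    (9.3, 3.2, 95.5),
    (9.4, 3.2, 95.9),
    (9.4, 3.3, 96.7),
    (9.4, 3.3, 96.6),
]

def seconds(hms: str) -> int:
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def pct_to_gib(pct: float) -> float:
    """Convert a %MEM value to GiB using MEM_TOTAL_KIB."""
    return pct / 100 * MEM_TOTAL_KIB / 1024 ** 2

# Only the first, fifth and sixth samples carry a timestamp.
elapsed = seconds("16:45:01") - seconds("16:42:22")
growth_gib = pct_to_gib(samples[4][2] - samples[0][2])

# RES minus SHR approximates the non-shared part of the process.
non_shared = [round(res - shr, 1) for res, shr, _ in samples]

print(f"{growth_gib:.2f} GiB more resident memory in {elapsed} s")
print("RES - SHR per sample (g):", non_shared)
```

Between the first and fifth samples the resident size grows by roughly 0.7 GiB in a little over two and a half minutes. RES minus SHR stays between 6.0g and 6.2g across all six samples, so most of the growth visible here shows up in the SHR column.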
This looks like a memory leak bug to me.
Tihomir.
On Thu, Sep 10, 2009 at 9:37 PM, Quanah Gibson-Mount <quanah(a)zimbra.com> wrote:
> --On Thursday, September 10, 2009 8:56 PM +0200 Tihomir Culjaga <
> tculjaga(a)gmail.com> wrote:
>
>> So, the situation is that I have 2 LDIF files I'm recreating the database
>> from.
>>
>> /usr/local/libexec/slapadd -l /home/tculjaga/file2.ldif -f
>> /usr/local/etc/openldap/slapd.conf
>> /usr/local/libexec/slapadd -l /home/tculjaga/file2.ldif -f
>> /usr/local/etc/openldap/slapd.conf
>>
>
> I would suggest you just make these a single file, so all the work can be
> done at one time.
>
>> I tried re-indexing with /usr/local/libexec/slapindex -f
>> /usr/local/etc/openldap/slapd.conf -v
>> and restarting the slapd process and the machine ... it is always the same
>> issue.
>>
>
> Nothing here indicates a problem with your indices. Running slapindex
> repeatedly is a waste of your time.
>
>> [root@l01lnp2 traces]# /usr/local/libexec/slapd -V
>> @(#) $OpenLDAP: slapd 2.4.16 (Sep 9 2009 14:39:44) $
>> root@l01lnp2:/home/tculjaga/openldap-2.4.16/servers/slapd
>>
>
> I would strongly urge you to upgrade to 2.4.18 (for reasons I will note
> further down)
>
>
>> [root@l01lnp2 traces]# /usr/local/BerkeleyDB.4.7/bin/db_stat -V
>> Berkeley DB 4.7.25: (May 15, 2008) - unpatched!
>>
>
> You need to rebuild BDB 4.7.25 with the 4 patches from Oracle. There are
> known issues when running BDB 4.7 without them.
>
>> [root@l01lnp2 traces]# du -c -h /var/lib/ldap/*.bdb
>> 200K /var/lib/ldap/bestMatchPrefix.bdb
>> 3.8G /var/lib/ldap/dn2id.bdb
>> 6.2G /var/lib/ldap/id2entry.bdb
>> 1.8M /var/lib/ldap/objectClass.bdb
>> 1.2M /var/lib/ldap/originatorPrefixID.bdb
>> 48M /var/lib/ldap/uniqueID.bdb
>> 10G total
>>
>
> Since your database is a total of 10 GB in size, for slapadd to work at
> optimum efficiency, you need at least 10GB of cache in your DB_CONFIG file.
> Unfortunately, you only have 10GB of RAM. Essentially, your system is
> underpowered for your database size.
>
>
>
>> [tculjaga@l01lnp2 ~]$ cat ot.ldif | grep -c "dn: "
>> 101588
>> [tculjaga@l01lnp2 ~]$ cat l01sipdir1.ldif | grep -c "dn: "
>> 9994864
>> [tculjaga@l01lnp2 ~]$
>>
>
> So you have 10,096,452 entries total.
>
>> [root@l01lnp2 traces]# cat /var/lib/ldap/DB_CONFIG | grep -v "#"
>>
>> set_cachesize 0 3221225472 1
>> set_lg_regionmax 262144
>> set_lg_bsize 2097152
>>
>
> You only have a 3GB DB cachesize configured here. Expect things to perform
> suboptimally. It would have been easier to set this with
>
> set_cachesize 3 0 1
>
> Which would have the same effect, since the first number is the number of
> gigabytes to allocate.
>
>> Please find attached slapd.conf
>>
>
> Ok, so the relevant bits from here are:
>
> cachesize 2500000
> idlcachesize 7500000
> cachefree 1000
>
> Which means you have a cachesize of 2.5 million, an idlcachesize of 7.5
> million, and (with OL 2.4.16) a dncachesize of 5 million.
>
> I would highly advise you to upgrade to OpenLDAP 2.4.18, and change the
> slapd.conf settings to:
>
> dncachesize 0 (which means unlimited).
>
> And set no cachesize or idlcachesize, and fix your DB_CONFIG. But you
> also need to buy a substantial amount of RAM for a DB of this size. :P I
> would advise you to upgrade to at least 32GB total. Then you can more
> optimally tune the system.
>
>
> --Quanah
>
> --
>
> Quanah Gibson-Mount
> Principal Software Engineer
> Zimbra, Inc
> --------------------
> Zimbra :: the leader in open source messaging and collaboration
>