Statistically, that should be relevant. I mean, I usually do:
i=0; while [ $i -lt 100 ]; do pstack <MYPID> > pstack.$i; (( i+=1 )); done;
Yes, no sleep, just a burst of pstacks. That is statistically about as accurate as what any sampling-based profiler would tell you, without the complexity of having to install such a tool (kernel prerequisites, etc.), and you can collect it in less than a minute.
Sometimes, though, the output can be hard to read for people not used to it.
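To make a burst of samples easier to read, one option is to count which function sits on top of each thread's stack across all the files. The following is only a rough sketch: it assumes Solaris-style pstack output, where each thread starts with a header line containing "lwp#" or "thread#" and each frame reads "<hex address> <function> (args...)", and summarize_top_frames is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: count the top-of-stack function over many pstack samples.
# Assumption: Solaris-style output, i.e. a header line containing "lwp#" or
# "thread#" per thread, followed by frames of the form "<hex addr> <function> (...)".
summarize_top_frames() {
    awk '
        /lwp#|thread#/ { want_top = 1; next }      # thread header: the next frame is the top of stack
        want_top && $1 ~ /^[0-9a-fA-F]+$/ {        # first frame line after a header
            count[$2]++
            want_top = 0
        }
        END { for (f in count) printf "%6d %s\n", count[f], f }
    ' "$@" | sort -rn                              # most frequent function first
}
```

It would be called as `summarize_top_frames pstack.*` in the directory holding the samples. Functions that show up at the top in most samples are where the threads spend their time; idle threads typically sit in a wait or poll call.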
If you send me your output, I may try to help.
Best Regards
++Cyrille

From: Luca Polidoro [mailto:luca.polidoro@gmail.com]
Sent: Friday, September 06, 2013 3:08 PM
To: Maucci, Cyrille
Cc: openldap-technical@openldap.org
Subject: Re: Slapd High CPU usage on Solaris 9
Hi, I have already done these tests, but the result provides little information, none of which is useful for directing the analysis.
2013/9/6 Maucci, Cyrille <cyrille.maucci@hp.com>
When I myself face such a problem, I usually pstack the process a few times to very quickly know what the guy is doing. And that usually gives me a good clue.
++Cyrille
From: openldap-technical-bounces@OpenLDAP.org [mailto:openldap-technical-bounces@OpenLDAP.org] On Behalf Of Luca Polidoro
Sent: Monday, August 12, 2013 3:31 PM
To: openldap-technical@openldap.org
Subject: Slapd High CPU usage on Solaris 9
Hello,
I am writing to submit a case that has been happening over the last 2 weeks in our infrastructure, which is structured as follows:
1 provider: Solaris 9 SPARC - Sun Fire V490 - latest OS patch level, CPU: 4 x 1500 MHz, RAM: 32 GB
OpenLDAP version used: 2.4.23 with Berkeley DB 4.8.30 (database bdb), all 64-bit
18 consumers: Solaris 9 SPARC - latest OS patch level, with different hardware (CPU, RAM)
On the following consumers:
Consumer 1: Solaris 9 SPARC - Sun Fire 480R - latest OS patch level, CPU: 4 x 900 MHz, RAM: 8 GB
Consumer 2: Solaris 9 SPARC - Sun Fire 480R - latest OS patch level, CPU: 4 x 1050 MHz, RAM: 8 GB
Consumer 3: Solaris 9 SPARC - Sun Fire 480R - latest OS patch level, CPU: 4 x 1050 MHz, RAM: 8 GB
Consumer 4: Solaris 9 SPARC - Sun Fire V210 - latest OS patch level, CPU: 2 x 1336 MHz, RAM: 8 GB
we are noticing an increase in the CPU used by the slapd process. The process sits constantly between 85% and 95% CPU and becomes completely unusable, so we are forced to restart it. The directory holds 1,000,000 objects.
This is the consumer's slapd.conf (I have omitted parts of the ACL, includes, etc.):

# See slapd.conf(5) for details on configuration options.
# This file should NOT be world readable.
#

#
# VERSION v2 - Digital Tru64
#
allow bind_v2
Some include ...
#
# tuning parameters - START
# ------------------------------
#
conn_max_pending 1000
conn_max_pending_auth 1000

idletimeout 500
sizelimit unlimited
threads 8
timelimit 500
disallow bind_anon

#
# tuning parameters - END
# ----------------------------
#
...
#######################################################################
# bdb database definitions
#######################################################################

database bdb
suffix "xxxxxxxxxxxx"
rootdn "cn=root,ou=ldapusers,xxxxx"

directory /var/openldap-2.4.23_64/var/openldap-data
#####disallow limit for syncuser
limits dn.children="ou=syncusers,xxxx" size=unlimited
index objectClass,entryCSN,entryUUID eq
index ou eq,sub,subinitial,subany,subfinal
index uidOwner eq
index uid eq
index memberUid eq

#shm_key 1100
cachesize 1000000
cachefree 10000
dncachesize 1000000
idlcachesize 1000000
searchstack 16
checkpoint 1024 10

overlay ppolicy
ppolicy_default "cn=Standard,ou=Policies,xxxx"
ppolicy_use_lockout

############################SYNCREPL CONF
syncrepl rid=011
        provider=ldap://xxxxxx
        type=refreshAndPersist
        interval=00:00:15:00
        retry="15 10 120 +"
        searchbase="xxxxx"
        filter="(objectClass=*)"
        attrs="*,+"
        scope=sub
        schemachecking=on
        bindmethod=simple
        binddn="xxxxxx"
        credentials=xxxx
############################SYNCREPL CONF
These are the bdb files:
420M dn2id.bdb
30M  entryCSN.bdb
32M  entryUUID.bdb
1,4G id2entry.bdb
18M  memberUid.bdb
4,9M objectClass.bdb
5,3M ou.bdb
17M  uid.bdb
17M  uidOwner.bdb
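For a sense of scale, the sizes listed above can simply be added up. This is a small sketch only: total_mib is a hypothetical helper name, the figures are copied from the listing (including its decimal commas), and M and G are assumed to mean MiB and GiB.

```shell
#!/bin/sh
# Hypothetical helper: sum "<size><M|G> <name>" lines read from stdin, result in MiB.
# Assumption: M = MiB and G = GiB; decimal commas ("1,4G") are accepted.
total_mib() {
    LC_ALL=C awk '
        {
            size = $1
            sub(/,/, ".", size)                              # "1,4G" -> "1.4G"
            unit  = substr(size, length(size))               # trailing M or G
            value = substr(size, 1, length(size) - 1) + 0    # numeric part
            if (unit == "G") value *= 1024
            total += value
        }
        END { printf "%.0f\n", total }
    '
}

# Sizes copied from the listing above; prints 1978, i.e. a bit under 2 GiB on disk.
total_mib <<'EOF'
420M dn2id.bdb
30M entryCSN.bdb
32M entryUUID.bdb
1,4G id2entry.bdb
18M memberUid.bdb
4,9M objectClass.bdb
5,3M ou.bdb
17M uid.bdb
17M uidOwner.bdb
EOF
```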
This is the DB_CONFIG:
-----------------------------------------------------------
##########################################
###########################################
#set_cachesize 0 300000000 10
#set_lg_regionmax 262144
#set_lg_bsize 2097152
###########################################
###########################################
# replaces lockdetect directive
#set_lk_detect DB_LOCK_EXPIRE
set_lk_detect DB_LOCK_DEFAULT
# uncomment if dbnosync required
#ADDED ALL
#set_flags DB_TXN_WRITE_NOSYNC
####ADDED
set_flags DB_LOG_AUTOREMOVE
# multiple set_flags directives allowed
# sets max log size = 5M (BDB default=10M)
set_lg_max 25242880
set_lg_dir /var/openldap-2.4.23_64/logs
set_cachesize 2 274726912 1
# sets a database cache of 5M and
# allows fragmentation
# does NOT replace slapd.conf cachesize
# this is a database parameter
#txn_checkpoint 128 15 0
# replaces checkpoint in slap.conf
# writes checkpoint if 128K written or every 15 mins
# 0 = no writes - no update
set_lk_max_locks 2500
set_lk_max_lockers 2500
set_lk_max_objects 2500
---------------------------------------------------
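A note for anyone reading the DB_CONFIG above: the template comments mention 5M, but the values actually set are considerably larger. Below is a quick sketch of the arithmetic, assuming the documented Berkeley DB meaning of set_cachesize (gbytes, bytes, ncache) and that set_lg_max is given in bytes. cache_mib is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: total BDB cache in MiB for "set_cachesize <gbytes> <bytes> <ncache>".
# $1 = gbytes, $2 = bytes (integer division, so any remainder below 1 MiB is dropped).
cache_mib() {
    echo $(( $1 * 1024 + $2 / 1048576 ))
}

# Values copied from the DB_CONFIG above.
cache_mib 2 274726912            # prints 2310 (2 GiB + 262 MiB, in a single region)
echo $(( 25242880 / 1048576 ))   # set_lg_max in MiB: prints 24
```

So the effective settings are a BDB cache of roughly 2.3 GB and a maximum log file size of about 24 MB, rather than the 5M stated in the comments.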
We have tried raising the number of threads to 16 and lowering the idletimeout and timelimit parameters, but without result.
Appreciate your feedback.
Thanks,
Luca