OpenLDAP software community,
I would like to ask for your experience with an issue I'm facing while
using JMeter to load test OpenLDAP.
I wrote a test plan that keeps around 10 threads connected
simultaneously, with a ramp-up time of 1 second. It also reads search
filters from a CSV file, so the bind+search+unbind sequence cycles
through every record in the DB.
The issue is that after some time I see some connections being refused
(no bind), but after a while connections start to be accepted again.
Analyzing the logs and everything else, I noticed the following
happening on the system:
1) All connections are accepted by the OpenLDAP machine and then the
number of TIME_WAIT connections starts to increase (which makes sense);
[root@brtldp11 ~]# netstat -an --protocol=inet -t|wc -l
11636 (around this)
2) The number of connections (ESTABLISHED, TIME_WAIT, etc.) normally
grows to over 12,000 and then starts to fall;
3) Some time after these connections start to fall, I begin to see
errors like this in the JMeter output:
<sample t="177" lt="0" ts="1251214974190" s="false" lb="LDAP Bind"
rc="800" rm="javax.naming.CommunicationException: 10.142.15.170:389
[Root exception is java.net.ConnectException: Cannot assign requested
address]" tn="Thread Group1 1-2" dt="text" by="409">
<responseData
class="java.lang.String"><ldapanswer><operation><opertype>bind</opertype>
<baseobj>ou=CONTENT,o=domain,c=fr</baseobj>
<binddn>cn=admin,ou=CONTENT,o=domain,c=fr</binddn>
<connectionTO>3000</connectionTO>
</operation>
<responsecode>800</responsecode>
<responsemessage>javax.naming.CommunicationException:
10.142.15.170:389 [Root exception is java.net.ConnectException: Cannot
assign requested address]</responsemessage>
</ldapanswer>
</responseData>
</sample>
4) I also start to see messages like the ones below in syslog (LOCAL4)
for OpenLDAP (loglevel 2):
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(27): no connection!
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(43): no connection!
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(28): no connection!
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(40): no connection!
Aug 25 12:43:00 brtldp11 slapd[2720]: connection_read(37): no connection!
Aug 25 12:43:00 brtldp11 slapd[2720]: connection_read(30): no connection!
Aug 25 12:43:01 brtldp11 slapd[2720]: connection_read(23): no connection!
Aug 25 12:43:02 brtldp11 slapd[2720]: connection_read(36): no connection!
Aug 25 12:43:02 brtldp11 slapd[2720]: connection_read(41): no connection!
5) The netstat count decreases and then the whole cycle starts over,
going through all the steps again.
[root@brtldp11 ~]# netstat -an --protocol=inet -t|wc -l
9217
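
To see which TCP states make up the totals in the steps above, the
netstat output can be tallied per state instead of just counted with
wc -l. This is only a minimal sketch: the function name count_tcp_states
is made up for illustration, and it assumes the usual Linux netstat
layout where the state is the sixth column.

```shell
# Tally TCP sockets per state from `netstat -an -t` output read on stdin.
# Assumes the state is in column 6, as in Linux net-tools netstat.
count_tcp_states() {
    # Only lines starting with "tcp" are sockets; this skips both headers.
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage on a live system:
# netstat -an --protocol=inet -t | count_tcp_states
```

Running this every few seconds during the test would show whether the
rise and fall is almost entirely TIME_WAIT entries.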
I checked every tuning parameter I know of and could not find an
explanation. Some of the parameters are:
[root@brtldp11 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000
[root@brtldp11 ~]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 196607
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 196607
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
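
As a rough check on whether ephemeral port exhaustion could be involved,
the ip_local_port_range shown above can be turned into an approximate
ceiling on new connections per second. This is a sketch under
assumptions: that the machine running JMeter uses the same range, that
closed connections stay in TIME_WAIT there for about 60 seconds (the
usual Linux value), and that every connection goes to the same server
address and port. The function name max_conn_rate is hypothetical.

```shell
# Estimate how many new connections per second one client can open to a
# single server address:port before all its ephemeral ports are in use.
# Args: low port, high port, TIME_WAIT seconds (assumed 60 if omitted).
max_conn_rate() {
    low=$1; high=$2; time_wait=${3:-60}
    ports=$(( high - low + 1 ))
    echo "$ports ports, about $(( ports / time_wait )) new connections/s"
}

# With the range shown above:
max_conn_rate 32768 61000
# prints: 28233 ports, about 470 new connections/s
```

If the bind+search+unbind loop opens connections faster than that rate
for long enough, a connect() failure such as "Cannot assign requested
address" would be one possible outcome.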
I am confident the test behaves as expected, since when I check the
ESTABLISHED connections I see the number of simultaneous connections I
expect. For example:
[root@brtldp11 ~]# grep -c ESTABLISHED log1.txt
9
This matches what I configured in the test: around 10 simultaneous
connections with a 1 second ramp-up. This was not supposed to be a very
high load test, yet this strange behavior is happening.
Even with this small number of connections I have this issue, and from
time to time some connections are refused. When I see this situation
and try to run ldapsearch, I get:
[root@brtldp12 jakarta-jmeter-2.3.4]# time ldapsearch -LLL -x \
    -D "cn=admin,ou=CONTENT,o=domain,c=fr" -w secret \
    -b "ou=CONTENT,o=domain,c=fr" -H ldap://10.142.15.170:389 \
    'pnnumber=+554184096814'
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
real 0m0.059s
user 0m0.000s
sys 0m0.023s
Running it again a few seconds later, the query returns data.
Any ideas? Could this be some OpenLDAP configuration? Has anyone already
set up a heavy load test and OS tuning for OpenLDAP?
Best Regards,
Rodrigo.