OpenLDAP software community,
I would like to ask for your experience with an issue I'm facing while using JMeter to load test OpenLDAP.
I made a script that keeps around 10 simultaneous thread connections with a ramp-up time of 1 second. I also use a CSV file with the search filters for the records, so the bind + search + unbind sequence always passes through all the records that exist in the DB.
The issue is that after some time I see some connections being refused (no bind), but after a while the connections start to be accepted again.
Analyzing the logs and everything else, I noticed the following things happening in the system:
1) I see that all connections are accepted by the OpenLDAP machine, and then the number of TIME_WAIT connections starts to increase (which makes sense):
[root@brtldp11 ~]# netstat -an --protocol=inet -t|wc -l
11636
(around this value)
2) Once the number of connections (ESTABLISHED, TIME_WAIT, etc.) has grown to values normally over 12,000, it starts to decrease;
3) Some time after these connections start to decrease, I begin to see errors in the JMeter output for the connections:
<sample t="177" lt="0" ts="1251214974190" s="false" lb="LDAP Bind" rc="800" rm="javax.naming.CommunicationException: 10.142.15.170:389 [Root exception is java.net.ConnectException: Cannot assign requested address]" tn="Thread Group1 1-2" dt="text" by="409">
  <responseData class="java.lang.String"><ldapanswer><operation><opertype>bind</opertype>
    <baseobj>ou=CONTENT,o=domain,c=fr</baseobj>
    <binddn>cn=admin,ou=CONTENT,o=domain,c=fr</binddn>
    <connectionTO>3000</connectionTO>
    </operation>
    <responsecode>800</responsecode>
    <responsemessage>javax.naming.CommunicationException: 10.142.15.170:389 [Root exception is java.net.ConnectException: Cannot assign requested address]</responsemessage>
  </ldapanswer>
  </responseData>
</sample>
4) I also start to see messages like the following in syslog (LOG4) for OpenLDAP (loglevel 2):
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(27): no connection!
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(43): no connection!
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(28): no connection!
Aug 25 12:42:59 brtldp11 slapd[2720]: connection_read(40): no connection!
Aug 25 12:43:00 brtldp11 slapd[2720]: connection_read(37): no connection!
Aug 25 12:43:00 brtldp11 slapd[2720]: connection_read(30): no connection!
Aug 25 12:43:01 brtldp11 slapd[2720]: connection_read(23): no connection!
Aug 25 12:43:02 brtldp11 slapd[2720]: connection_read(36): no connection!
Aug 25 12:43:02 brtldp11 slapd[2720]: connection_read(41): no connection!
5) The netstat count decreases, and then everything starts over and the behavior goes through all the steps again.
[root@brtldp11 ~]# netstat -an --protocol=inet -t|wc -l
9217
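To see which states make up these totals, the counts can be broken down per TCP state instead of relying on `wc -l` alone (which also counts netstat's two header lines). This is only a minimal sketch in Python, assuming the usual Linux `netstat -an -t` layout where the state is the sixth column. The function name `count_tcp_states` and the sample text are purely illustrative; the sample is made up and is not output from my machines.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count sockets per TCP state in the text output of `netstat -an -t`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol (tcp/tcp6) and carry the state in
        # the sixth column; the two header lines do not match and are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


# Illustrative sample only (made-up addresses), in the usual Linux layout.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:389             0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:389          192.0.2.20:40001        ESTABLISHED
tcp        0      0 192.0.2.10:389          192.0.2.20:40002        TIME_WAIT
tcp        0      0 192.0.2.10:389          192.0.2.20:40003        TIME_WAIT
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(state, n)
```

In practice the saved netstat output (for example a file like log1.txt) would be read and passed to the function in place of the sample text.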
I checked all the tuning parameters I know of and could not find an explanation. Some of the parameters are:
[root@brtldp11 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000
[root@brtldp11 ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 196607
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 196607
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
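As a rough back-of-the-envelope check on the ip_local_port_range value above, the sketch below computes how many ephemeral ports the range provides and what connection rate would be enough to keep all of them tied up in TIME_WAIT. It rests on several assumptions I have not verified: a TIME_WAIT duration of 60 seconds (the usual Linux value), every test iteration opening a new connection from one client IP to the same server IP and port, and the JMeter machine using the same port range as the one shown here. The function names are only illustrative.

```python
def ephemeral_port_count(low: int, high: int) -> int:
    """Number of ports in an inclusive ip_local_port_range."""
    return high - low + 1


def max_sustained_connect_rate(low: int, high: int,
                               time_wait_seconds: float = 60.0) -> float:
    """Approximate new connections per second beyond which one client IP
    would run out of source ports toward a single server address, assuming
    each closed connection holds its port for time_wait_seconds."""
    return ephemeral_port_count(low, high) / time_wait_seconds


if __name__ == "__main__":
    # Values from /proc/sys/net/ipv4/ip_local_port_range shown above.
    low, high = 32768, 61000
    print("ephemeral ports:", ephemeral_port_count(low, high))
    print("approx. connections/second:",
          round(max_sustained_connect_rate(low, high), 1))
```

If the rate of bind + search + unbind cycles in the test comes close to that figure, it might be related to the "Cannot assign requested address" errors, but this is only a hypothesis on my side.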
I am confident the test is behaving as expected, since if I check the ESTABLISHED connections I see the number of simultaneous connections I expect. For example:
[root@brtldp11 ~]# grep ESTA* log1.txt |wc -l
9
This matches what I configured in the test: around 10 simultaneous connections with a 1-second ramp-up. This was not supposed to be a very heavy load test, yet this strange behavior occurs.
Even with this small number of connections I have this issue, and from time to time some connections are refused. When I see this situation I also try to run ldapsearch, and I get:
[root@brtldp12 jakarta-jmeter-2.3.4]# time ldapsearch -LLL -x -D "cn=admin,ou=CONTENT,o=domain,c=fr" -w secret -b "ou=CONTENT,o=domain,c=fr" -H ldap://10.142.15.170:389 'pnnumber=+554184096814'
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

real    0m0.059s
user    0m0.000s
sys     0m0.023s
If I run the same query again a few seconds later, it returns data.
Any ideas? Could this be caused by some OpenLDAP configuration? Has anyone already set up a heavy load test and OS tuning for OpenLDAP?
Best Regards,
Rodrigo.