I have an experiment, that may have too many moving parts to tease apart.
We're exploring migrating our LDAP server from a 32-bit OS to a 64-bit OS, and I've also explored the port to the mdb back end. My first shot at this does not show the performance the write-ups on MDB would have me expect.
If it matters, my old environment is a Dell R610 server, with 6G of RAM.
# uname -r
2.6.18-308.13.1.el5PAE
# cat /etc/redhat-release
CentOS release 5.8 (Final)
# rpm -qf `which slapd`
openldap-servers-2.3.43-25.el5_8.1
My ldapsearch invocations on this host take ~120 seconds.
I set up two Dell R610s, each with 12G of memory.
One environment uses the RPM produced by the LTB project, here named 'ltb':
root@ltb# uname -r
2.6.32-431.17.1.el6.x86_64
root@ltb# cat /etc/redhat-release
CentOS release 6.5 (Final)
root@ltb# rpm -qf /usr/local/openldap/libexec/slapd
openldap-ltb-2.4.39-2.el6.x86_64
And the other uses the same OS, but uses CentOS's stock OpenLDAP RPM; here named 'stock':
root@stock# rpm -qf `which slapd`
openldap-servers-2.4.23-34.el6_5.1.x86_64
I migrated a database of ~360k DNs, and went through the cache tuning, following the suggestions here:
http://www.openldap.org/doc/admin24/tuning.html#Caching
http://blog.monitor.us/2012/02/berkeley-db-performance-tuning/
And with those tunings, my ldapsearch command does outperform our old 32-bit environment:
stock# time ldapsearch -x -w XXXXX -D "cn=manager,orgid=MyOrgID" -b orgid=MyOrgID dn | grep ^dn: | wc -l
3600071
real 0m27.147s
user 0m19.216s
sys 0m10.585s
However, my efforts to use the MDB backend yield slower responses, on the order of 175 seconds.
How I set up the MDB backend:
I copied my old slapd.conf over, and replaced my bdb backend with mdb settings.
- I set 'tool-threads' to 2
- I set 'maxsize' to 21474836480 [ 20G = 20 * ( 1024 * 1024 * 1024 ) ]
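The maxsize figure in the list above can be double-checked with shell arithmetic. This is only a sketch of the calculation quoted in the message; the variable names are made up for illustration.

```shell
# Recompute the 20 GiB map size in bytes (20 * 1024^3).
# "gib" and "maxsize" are illustrative names, not slapd settings read from anywhere.
gib=$((1024 * 1024 * 1024))
maxsize=$((20 * gib))
echo "maxsize = $maxsize bytes"
```

The value printed is what would go after the 'maxsize' keyword in the mdb database section of slapd.conf.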
ltb# /usr/local/openldap/sbin/slaptest -f /etc/openldap/slapd.conf -F /etc/openldap/slapd.d
config file testing succeeded
ltb# chown -R ldap:ldap /etc/openldap/slapd.d
I was apparently able to import all 360k DNs without an error:
ltb# time setuidgid ldap /usr/local/openldap/sbin/slapadd \
  -q -v -F /etc/openldap/slapd.d -b orgId=MyOrgID \
  -l export_from_OLD.ldif >& log; echo $?
real 16m46.924s
user 8m52.165s
sys 3m44.859s
0
ltb# ls -ld data.mdb
-rw------- 1 ldap ldap 21474836480 Jul 30 20:16 data.mdb
ltb# du -c -h data.mdb
6.3G data.mdb
6.3G total
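The gap between the ls size (20G, equal to maxsize) and the du size (6.3G) above is what a sparse file looks like: the apparent length is large, but only the pages actually written occupy disk blocks. The sketch below reproduces the effect on a scratch file. It assumes GNU coreutils (truncate, stat -c), and the 64M size and variable names are arbitrary choices for illustration.

```shell
# Create a scratch file with a 64 MiB apparent size but no data written to it.
f=$(mktemp)
truncate -s 64M "$f"

# Apparent size in bytes, i.e. what "ls -l" reports.
apparent=$(stat -c %s "$f")

# Space actually allocated, in KiB, i.e. what "du" reports.
allocated_kib=$(du -k "$f" | cut -f1)

echo "apparent: $apparent bytes, allocated: $allocated_kib KiB"
rm -f "$f"
```

On most Linux filesystems the allocated figure stays near zero until data is written, which is consistent with the data.mdb numbers shown above.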
But, the same search looks worse here.
ltb# time ldapsearch -x -w XXXXX -D "cn=manager,orgid=MyOrgID" -b orgid=MyOrgID dn | grep ^dn: | wc -l
3600071
real 2m55.482s
user 0m25.459s
sys 0m23.948s
Both 64-bit hosts show no swapping, and minimal CPU load. Can anyone point out what I've missed?
Brian Reichert wrote:
Both 64-bit hosts show no swapping, and minimal CPU load. Can anyone point out what I've missed?
While that search is running you should see slapd at 100% CPU. If not, then something in your system is throttling your connection.
On Tue, Aug 12, 2014 at 11:12:51AM -0700, Howard Chu wrote:
While that search is running you should see slapd at 100% CPU. If not, then something in your system is throttling your connection.
And it is not at 100%. 'top' shows slapd on this host is only at ~50%.
I'll review the 'threads' setting.
Thanks for the feedback...
--
  -- Howard Chu
  CTO, Symas Corp. http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP http://www.openldap.org/project/
On Tue, Aug 12, 2014 at 02:04:20PM -0400, Brian Reichert wrote:
I'll review the 'threads' setting.
Progress!
The 'ldap' user has no system limits set on it:
# setuidgid ldap sh -c ulimit -a
unlimited
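One caveat about the check above, offered as an observation rather than a diagnosis: with `sh -c ulimit -a`, the shell receives only `ulimit` as its command string and `-a` becomes `$0`, so the single line of output is the file-size limit alone. Quoting the command string shows the full list. A minimal sketch of the difference, run as the current user (variable names are illustrative):

```shell
# Unquoted: the command string is just "ulimit"; "-a" is assigned to $0.
unquoted=$(sh -c ulimit -a)

# Quoted: the whole "ulimit -a" is the command string.
quoted=$(sh -c 'ulimit -a')

unquoted_lines=$(printf '%s\n' "$unquoted" | wc -l)
quoted_lines=$(printf '%s\n' "$quoted" | wc -l)

echo "unquoted form printed $((unquoted_lines)) line(s)"
echo "quoted form printed $((quoted_lines)) line(s)"
```

To see every limit for the 'ldap' user, the equivalent would be `setuidgid ldap sh -c 'ulimit -a'`.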
I have 2 CPUs with 4 cores each:
# grep "^physical id" /proc/cpuinfo | sort -u | wc -l
2
# grep "^cpu cores" /proc/cpuinfo | sort -u
cpu cores : 4
This page recommends:
http://www.openldap.org/doc/admin24/tuning.html#%7B%7Bslapd%7D%7D%288%29%20T...
This value should generally be a function of the number of "real" cores on the system, for example on a server with 2 CPUs with one core each, set this to 8, or 4 threads per real core.
Assuming 'real core' == CPU, in my case, I think this should be 8 (4 * 2 physical CPUs). Is that correct?
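How "real core" is read changes the answer: four threads per physical CPU package gives 8 on this host, while four threads per core gives 32. The sketch below only gathers the socket and core counts that either reading needs. `cpu_summary` is a hypothetical helper, not part of OpenLDAP, and it assumes the Linux /proc/cpuinfo layout shown above.

```shell
# Hypothetical helper: summarise sockets and cores from a cpuinfo-style file.
# Defaults to /proc/cpuinfo; pass another path to analyse a saved copy.
cpu_summary() {
    awk -F: '
        /^physical id/ { gsub(/[ \t]/, "", $2); sockets[$2] = 1 }
        /^cpu cores/   { gsub(/[ \t]/, "", $2); per_socket = $2 }
        END {
            n = 0
            for (s in sockets) n++
            printf "sockets=%d cores_per_socket=%d total_cores=%d\n", n, per_socket, n * per_socket
        }
    ' "${1:-/proc/cpuinfo}"
}

# Only run against the live system when the file exists (Linux).
if [ -r /proc/cpuinfo ]; then
    cpu_summary
fi
```

Multiplying either `sockets` or `total_cores` by four gives the two candidate values for the 'threads' directive discussed here.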
It was set to 32 (I have no idea why), and 'top' showed ~50%.
When I changed threads to '8', my query times dropped to ~22 seconds, which is _much_ better than the 175 I was seeing.
'top' still shows slapd only using 50%, so I hazard that it keeps to one CPU. Is that a valid assumption?
Could this mdb database perform better? It's outperforming my bdb backend by 25%, which isn't too shabby, but I'm curious if this sort of performance increase is typical...
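The timings quoted in this thread can be compared directly. Depending on whether one measures time saved or relative speed, the mdb-versus-bdb figure comes out somewhat under the 25% mentioned. The sketch below uses the numbers from the messages above (27.147 s for bdb, roughly 22 s for mdb with 'threads' at 8, roughly 175 s for mdb before that change); the variable names are illustrative.

```shell
# Wall-clock timings in seconds, taken from the messages in this thread.
bdb=27.147
mdb_before=175
mdb_after=22

# Percentage of wall-clock time saved by mdb relative to bdb.
time_saved=$(awk -v a="$bdb" -v b="$mdb_after" 'BEGIN { printf "%.1f", (a - b) / a * 100 }')

# Speed ratio of mdb over bdb (values above 1 mean mdb is faster).
speedup=$(awk -v a="$bdb" -v b="$mdb_after" 'BEGIN { printf "%.2f", a / b }')

# Effect of the threads change on mdb alone.
tuning_gain=$(awk -v a="$mdb_before" -v b="$mdb_after" 'BEGIN { printf "%.2f", a / b }')

echo "mdb saves ${time_saved}% of the bdb time (${speedup}x as fast)"
echo "threads change made mdb ${tuning_gain}x as fast"
```

These are single runs with rounded timings, so the ratios are rough indications rather than benchmarks.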
On Tue, Aug 12, 2014 at 03:22:57PM -0400, Brian Reichert wrote:
'top' still shows slapd only using 50%, so I hazard that it keeps to one CPU. Is that a valid assumption?
Try pressing "1" to have top show individual CPUs.
-- Brian Reichert reichert@numachi.com BSD admin/developer at large
On Tue, Aug 12, 2014 at 03:47:10PM -0400, Christopher Wood wrote:
Try pressing "1" to have top show individual CPUs.
Yep, I know that trick. On my host, I see this:
  PID USER  PR NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
16278 root  20  0 44764 1888 1488 R 100.0  0.0  0:18.13 ldapsearch
15840 ldap  20  0 20.4g 4.3g 4.3g S  51.1 36.9 14:48.09 slapd
One could argue I should not be running a big query on the same host as the server, but that is how I've been doing all of my recent testing.
Brian Reichert wrote:
One could argue I should not be running a big query on the same host as the server, but that is how I've been doing all of my recent testing.
If ldapsearch is already running at 100% CPU then it's the limiting factor here so no, you're not going to get any faster. I still find it a bit strange, ldapsearch should still be faster than slapd. Do you have a non-OpenLDAP ldapsearch installed on that machine? The FedoraDS/389DS/RHDS tools are certainly slower, so that could make a difference.
On Tue, Aug 12, 2014 at 06:59:52PM -0700, Howard Chu wrote:
If ldapsearch is already running at 100% CPU then it's the limiting factor here so no, you're not going to get any faster. I still find it a bit strange, ldapsearch should still be faster than slapd. Do you have a non-OpenLDAP ldapsearch installed on that machine? The FedoraDS/389DS/RHDS tools are certainly slower, so that could make a difference.
Good call; I'll review; I was using the ldapsearch from CentOS's RPM, not the one provided by the LTB project's RPM.
I'll report back.
On Wed, Aug 13, 2014 at 11:34:46AM -0400, Brian Reichert wrote:
I'll report back.
Ok, just to report:
Using LTB's ldapsearch didn't improve things.
Running an ldapsearch remotely did allow slapd to take up 100% CPU, but the queries didn't complete any quicker. As I introduced network latency, I'm not shocked.
I feel that, for now, this is Good Enough.
There are other attributes of the MDB database I need to explore; resizing, backups, etc. Off I go!
--On Thursday, August 14, 2014 10:41 AM -0400 Brian Reichert reichert@numachi.com wrote:
Ok, just to report:
Using LTB's ldapsearch didn't improve things.
For kicks, you can try a search over ldapi:/// and ignore the network interface entirely.
Also, you may want to look at the writemap flag for mdb.
--Quanah
--
Quanah Gibson-Mount
Server Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
On Thu, Aug 14, 2014 at 11:15:32AM -0700, Quanah Gibson-Mount wrote:
For kicks, you can try a search over ldapi:/// and ignore the network interface entirely.
Hmm, never messed with that. That knocked my simple queries from ~25 seconds to ~20. Which is a sizable improvement, but I'm not certain that the ldapi interface will be useful for us in production. I'd have to see if JNDI implementations can handle it. Thanks for the suggestion, though!
Also, you may want to look at the writemap flag for mdb.
My simple tests are read-only queries, and that seems like a tricky flag to have in use in production. Or so I infer from the manpage.
Hi,
On Tue, 12 Aug 2014, Brian Reichert wrote:
I have 2 CPUs with 4 cores each:
<snipp/>
Have you tried starting numad on CentOS to work around NUMA CPU architecture issues?
Greetings Christian
On Tue, Aug 12, 2014 at 11:49:57PM +0200, Christian Kratzer wrote:
Have you tried starting numad on CentOS to work around NUMA CPU architecture issues?
I'm not familiar with this; let me do some research.
Greetings Christian
--
Christian Kratzer                  CK Software GmbH
Email:   ck@cksoft.de              Wildberger Weg 24/2
Phone:   +49 7032 893 997 - 0      D-71126 Gaeufelden
Fax:     +49 7032 893 997 - 9      HRB 245288, Amtsgericht Stuttgart
Mobile:  +49 171 1947 843          Geschaeftsfuehrer: Christian Kratzer
Web:     http://www.cksoft.de/
Hi!
If your "top" isn't brand new (they changed the UI), this is what you could try: Press "1" to show statistics for each logical CPU, press "f j <RETURN>" to display the CPU each process runs on, press "H" to display threads, press "i" to leave idle processes out of the display, and press "s 0.5 <RETURN>" to refresh the display twice per second (you may increase the rate to any paranoid value you like). Then watch what's going on. Keep an eye on the wait and system parts of CPU as well...
Regards, Ulrich
openldap-technical@openldap.org