back-mdb status
by Howard Chu
A bit of a summary of how the backend is shaping up. I've been testing with a
variety of synthetic LDIFs as well as an actual application database (Zimbra
accounts).
I noted before that back-mdb's write speeds on disk are quite slow. This is
because a lot of its writes will be to random disk pages, and also the data
writes in a transaction commit are followed by a meta page write, which always
involves a seek to page 0 or page 1 of the DB file. For slapadd -q this effect
can be somewhat hidden because the writes are done with MDB_NOSYNC specified,
so no explicit flushes are performed. In my current tests with synchronous
writes, back-mdb is one half the speed of back-bdb/hdb.
(Even in fully synchronous mode, BDB only writes its transaction logs
synchronously, and those are always sequential writes so there's no seek
overhead to deal with.)
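The commit pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not MDB's actual code: the names `toy_open`, `toy_commit`, `toy_meta`, and `TOY_PAGE_SIZE`, the meta page layout, and the choice of meta page by transaction ID parity are all assumptions made up for the example. It only shows why a synchronous commit costs scattered data writes plus a trip back to page 0 or 1.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical meta page contents: just the ID of the committed txn. */
struct toy_meta {
    uint64_t txnid;
};

/* Create (or truncate) a toy DB file and return its descriptor, or -1. */
static int toy_open(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/*
 * Write npages dirty data pages at their (possibly scattered) page numbers,
 * then publish the commit by writing a meta page to page 0 or page 1.
 * Picking the meta page by txnid parity is an assumption of this sketch.
 * If sync is nonzero, flush after the data pages and again after the meta
 * page; on a spinning disk that is where the seek and flush cost shows up.
 * Returns 0 on success, -1 on error.
 */
static int toy_commit(int fd, uint64_t txnid, const uint64_t *pgnos,
                      const void *pages, size_t npages, int sync)
{
    char metabuf[TOY_PAGE_SIZE];
    struct toy_meta meta;
    size_t i;

    for (i = 0; i < npages; i++) {
        const char *src = (const char *)pages + i * TOY_PAGE_SIZE;
        off_t off = (off_t)pgnos[i] * TOY_PAGE_SIZE;

        /* Pages 0 and 1 are reserved for meta pages in this sketch. */
        assert(pgnos[i] >= 2);
        if (pwrite(fd, src, TOY_PAGE_SIZE, off) != TOY_PAGE_SIZE)
            return -1;
    }
    if (sync && fsync(fd) != 0)
        return -1;

    /* The meta page write always lands at the very start of the file. */
    memset(metabuf, 0, sizeof(metabuf));
    meta.txnid = txnid;
    memcpy(metabuf, &meta, sizeof(meta));
    if (pwrite(fd, metabuf, TOY_PAGE_SIZE,
               (off_t)(txnid & 1) * TOY_PAGE_SIZE) != TOY_PAGE_SIZE)
        return -1;
    if (sync && fsync(fd) != 0)
        return -1;
    return 0;
}
```

Calling `toy_commit` with `sync` set to 0 corresponds loosely to the no-explicit-flush case mentioned for slapadd -q: the same writes are issued, but the OS is free to reorder and batch them.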
With that said, slapadd -q for a 3.2M entry database on a tmpfs:
back-hdb: real 75m32.678s user 84m31.733s sys 1m0.316s
back-mdb: real 63m51.048s user 50m23.125s sys 13m27.958s
For back-hdb, BDB was configured with a 32GB environment cache. The resulting
DB directory consumed 14951004KB including data files and environment files.
For back-mdb, MDB was configured with a 32GB mapsize. The resulting DB
directory consumed 18299832KB. The input LDIF was 2.7GB, and there were 29
attributes indexed. Currently MDB is somewhat wasteful with space when dealing
with the sorted-duplicate databases that are used for indexing; there's
definitely room for improvement here.
Also this slapadd was done with tool-threads set to 1, because back-mdb only
allows one writer at a time anyway. There is also obviously room for
improvement here, in terms of a bulk-loading API for the MDB library.
With the DB loaded, I measured the time to execute a search that scans every
entry in the DB against each server.
Initially back-hdb was only configured with a cachesize of 10000 and
IDLcachesize of 10000. It was tested again using a cachesize of 5,000,000
(which is more than was needed since the DB only contained 3,200,100 entries).
In each configuration a search was performed twice - once to measure the time
to go from an empty cache to a fully primed cache, and again to measure the
time for the fully cached search.
                       first        second       slapd size
back-hdb, 10K cache    3m6.906s     1m39.835s    7.3GB
back-hdb, 5M cache     3m12.596s    0m10.984s    46.8GB
back-mdb               0m19.420s    0m16.625s    7.0GB
Next, the time to execute multiple instances of this search was measured,
using 2, 4, 8, and 16 ldapsearch instances running concurrently.
                 average result time, by number of concurrent searches
                 2            4            8            16
back-hdb, 5M     0m14.147s    0m17.384s    0m45.665s    17m15.114s
back-mdb         0m16.701s    0m16.688s    0m16.621s    0m16.955s
I don't recall doing this test against back-hdb on ada.openldap.org before;
certainly the total blowup at 16 searches was unexpected. But as you can see,
with no read locks in back-mdb, search performance is pretty much independent
of load. At 16 threads back-mdb slowed down measurably, but that's
understandable given that the rest of the system still needed CPU cycles here
and there. Otherwise, slapd was running at 1600% CPU the entire time. For
back-hdb, slapd maxed out at 1467% CPU; the lock overhead drove it into the
ground.
So far I'm pretty pleased with the results; for the most part back-mdb is
delivering on what I expected. Decoding each entry every time is a bit of a
slowdown, compared to having entries fully cached. But the cost disappears as
soon as you get more than a couple requests running at once.
Overall I believe it proves the basic philosophy - in this day and age, it's a
waste of application developers' time to incorporate a caching layer into
their own code. The OS already does it and does it well. Give yourself as
transparent a path as possible between RAM and disk using mmap, and don't fuss
with it any further.
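A minimal sketch of that approach in C, assuming a POSIX system. The function names `map_file_readonly` and `unmap_file` are made up for this example and are not part of MDB. The file is mapped read-only and the application reads it through an ordinary pointer, with no cache of its own; the OS page cache decides what stays in RAM.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map an existing, non-empty file read-only.
 * On success returns the mapping and stores its length in *len.
 * On failure returns NULL and leaves *len untouched.
 */
static const void *map_file_readonly(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && len != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Release a mapping obtained from map_file_readonly(). */
static void unmap_file(const void *p, size_t len)
{
    munmap((void *)p, len);
}
```

A caller would treat the returned pointer as a plain byte array, for example `const char *db = map_file_readonly(path, &len);` followed by direct reads of `db[offset]`. Any decoding of records on top of that is left to the application.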
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Overlay Mod - AccessLog + back-ldap
by Jim Finn
I am in the process of extending the capability of the accesslog overlay
to include logging of the client IP address of each request/op. I have
successfully implemented this on 2.3.42. The next step is to optionally
record the IP address of the server fielding the request. I'm
particularly interested in which server back-ldap is using to respond to
the request.
The closest I have come to finding the back-end server is li->li_uri, but
this is just the URI list as configured in slapd.conf. I'm hoping to find
some assistance in determining the back-end server.
Here are the changes that have been made to
2.3.43/servers/slapd/overlays/accesslog.c:
###BEGIN DIFF###
35d34
< #include "../back-ldap/back-ldap.h"
167c166
< *ad_reqReferral, *ad_reqOld, *ad_reqClient, *ad_reqServer;
---
> *ad_reqReferral, *ad_reqOld;
321,328d319
< { "( " LOG_SCHEMA_AT ".30 NAME 'reqClient' "
< "DESC 'Client Source Address' "
< "SYNTAX OMsDirectoryString "
< "SINGLE-VALUE )", &ad_reqClient },
< { "( " LOG_SCHEMA_AT ".31 NAME 'reqServer' "
< "DESC 'Destination Server Address' "
< "SYNTAX OMsDirectoryString "
< "SINGLE-VALUE )", &ad_reqServer },
345c336
< "reqResult $ reqMessage $ reqReferral $ reqClient $ reqServer) )",
---
> "reqResult $ reqMessage $ reqReferral ) )",
839,863d829
< //jfinn(a)searshc.com: Log Client IP Address/Hostname/URI to "reqClient"
< // attribute.
<
< BerValue clientIP;
< clientIP.bv_val = malloc(255);
< strcpy(clientIP.bv_val,op->o_hdr->oh_conn->c_peer_name.bv_val);
< strtok(clientIP.bv_val, "="); // use strtok to sanitize the string
< clientIP.bv_val = strtok (NULL, ":"); // IP=x.x.x.x:XXXX ---> x.x.x.x
< clientIP.bv_len = strlen (clientIP.bv_val);
< attr_merge_one( e, ad_reqClient, &clientIP, NULL );
<
< // end Client IP address logging
<
< //jfinn(a)searshc.com: Log Server IP Address/Hostname/URI to "reqServer"
< // attribute for use with back-ldap backend.
< ldapinfo_t *ldap_info = (ldapinfo_t *)op->o_bd->be_private;
< BerValue serverIP;
< serverIP.bv_val = malloc(255);
< // ldap_info->li_uri is just the URI string as config'd in slapd.conf.
< serverIP.bv_val = ldap_info->li_uri;
< serverIP.bv_len = strlen(serverIP.bv_val);
< attr_merge_one( e, ad_reqServer, &serverIP, NULL );
<
< // end Server IP address logging
<
###END DIFF###
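The client-address extraction in the diff above could also be written without strtok and malloc. The following is only a sketch: `peer_to_addr` is a hypothetical helper, not an existing slapd function, and it assumes the peer name has the "IP=x.x.x.x:XXXX" form shown in the diff. Accepting a bracketed "IP=[addr]:port" form for IPv6 peers is a further assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the address part of a peer name such as "IP=192.0.2.10:49152"
 * into out as a NUL-terminated string.
 * Also accepts "IP=[::1]:49152" (assumed IPv6 form).
 * Returns 0 on success, -1 if the input is not in the expected form
 * or out is too small to hold the address.
 */
static int peer_to_addr(const char *peer, char *out, size_t outlen)
{
    const char *start, *end;
    size_t n;

    if (peer == NULL || out == NULL || strncmp(peer, "IP=", 3) != 0)
        return -1;
    start = peer + 3;
    if (*start == '[') {
        /* Bracketed address: take everything up to the closing bracket. */
        start++;
        end = strchr(start, ']');
    } else {
        /* Plain address: take everything up to the port separator. */
        end = strchr(start, ':');
    }
    if (end == NULL || end == start)
        return -1;
    n = (size_t)(end - start);
    if (n >= outlen)
        return -1;
    memcpy(out, start, n);
    out[n] = '\0';
    return 0;
}
```

Compared with the version in the diff, this does not overwrite a malloc'd pointer with strtok's return value (which loses the pointer needed to free it), does not strcpy into a fixed 255-byte buffer without a length check, and avoids strtok's hidden static state, which is a concern in a threaded server. In the overlay, the result would presumably be wrapped in a BerValue with bv_len set to strlen(out) before being passed to attr_merge_one.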
Thanks in advance for your assistance!
Jim Finn
Plans for 2.4.27?
by Michael Ströder
HI!
Are there plans for a 2.4.27 release with the recent syncrepl fixes?
Ciao, Michael.
Re: ldap_get_value returns corrupt pointer
by kyle king
On 09/20/2011 06:21 PM, kyle king wrote:
> Under heavy threaded use ldap_get_value returns a corrupted memory
> pointer, not a char pointer pointer that was specified.
I have narrowed the corruption problem down to the file
libraries/libldap/get_values.c, at approximately line 80: the ber_scanf call
in the if statement.
When it works on my machine, it returns pointers for 'vals' such as:
vals pointer: 0x105c920
and when it fails:
vals pointer: 0x7f6130004c20
Kyle A. King
Quentus Technologies, INC
Cell Phone: 703-635-9512
Work Phone: 253-218-6030
Email: kyle.king(a)quentustech.com
Re: ldap_get_value returns corrupt pointer
by kyle king
I found out I was using a deprecated function; the new one,
ldap_get_values_len, deals with this issue.
Re: openldap.git branch mdb.master updated. 0ab841598ffb490f4246f892248f0b409e411cc1
by Howard Chu
> - Log -----------------------------------------------------------------
> commit 0ab841598ffb490f4246f892248f0b409e411cc1
> Author: Howard Chu<hyc(a)symas.com>
> Date: Sun Sep 18 16:39:18 2011 -0700
>
> Fix 09006ccec7928c9cf53bca6abe741e8d4d466c98
>
> Check for stale DBs was in the wrong place.
Since back-mdb does no caching, it's OK to run slapadd while slapd is running;
slapd will see the new data immediately. In fact you can run multiple slapds
on the same database and they will stay perfectly in sync. I'm not sure that
that's actually a useful thing to do, but you can do it if you want...
back-sql maintenance
by Shawn A. Wilson
I have been using OpenLDAP on Linux for a number of years and am now
evaluating a transition towards the back-sql backend but I understand this
is no longer actively maintained. Would somebody on this list be so kind
as to go into some of the reasoning behind this lack of development? Is
there now a better alternative for LDAP integration with an existing SQL
schema? Is the methodology of back-sql seriously flawed? Is it just a lack
of volunteers? I have encountered a possible bug and, having the
resources, would like to help maintain this SQL backend if that is the
appropriate thing to do.
Thanks,
-shawn
Is JDBC-LDAP still maintained?
by Alan Evans
I am trying to find if anyone still actively maintains JDBC-LDAP. I
have encountered a problem and I have tracked it back to what I think
is the problem but I am not a java developer so I cannot say for
certain.
I posted to the list at MyVirtualDirectory some time ago since it was
their binary distribution I was using, but I never received a
response. Does the OpenLDAP community still maintain the JDBC-LDAP
driver or does the community serve as an archival place? It looks
like the last commits to the git repo were years ago.
Regards,
-Alan
Re: openldap.git branch mdb.master updated. 0533f80364810fb89b1feac892f9397fffe6ebd0
by Howard Chu
> - Log -----------------------------------------------------------------
> commit 0533f80364810fb89b1feac892f9397fffe6ebd0
> Author: Howard Chu<hyc(a)symas.com>
> Date: Wed Sep 14 11:31:27 2011 -0700
>
> Add MacOSX support
>
> mmap() with FIXEDMAP fails, otherwise things work.
Interestingly enough, it succeeds under gdb. At a guess, address space layout
randomization prevents it from working in the normal case. I haven't bothered
to dig into this further; as far as I'm concerned MacOSX is a broken OS and
isn't worth the time. (google "macosx process shared mutexes" and follow the
trail thru semaphores and all the other brokenness if you feel like wasting a
few hours of your life as I just did.)