On Sat, 26 Mar 2011 17:29:25 +1100, Ian Willis
<Ian(a)checksum.net.au> wrote:
I'm involved with a project where we're exporting information from a
couple of bespoke systems for analytics purposes; however, there is a
degree of sensitivity associated with some of the information.
The developers of the analytics system have requested three APIs for
accessing objects, which are as follows. The first two APIs would be
interactive.
1 Input (user_id, object), output: bool, whether the user can access
the object.
2 Input (user_id, set of objects), output: the set of objects the
user can access.
3 Input (user_id), output: the set of objects a user can access, or
possibly the set that a user can't access. (This API can be a couple
of orders of magnitude slower than the previous two as it will only
be called once per session.)
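For illustration, and assuming a directory layout that hasn't been
settled yet (entries under ou=objects,dc=example,dc=com with a
hypothetical objectId attribute, and access control enforced by the
server for the bound user), these might map onto plain searches:

  # API 1: can this user read this object? Bind as the user; an
  # empty result means "no" under the server's ACLs.
  ldapsearch -x -D "uid=jsmith,ou=users,dc=example,dc=com" -W \
      -b "objectId=42,ou=objects,dc=example,dc=com" -s base \
      "(objectClass=*)" dn

  # API 2: filter a candidate set down to the readable subset.
  ldapsearch -x -D "uid=jsmith,ou=users,dc=example,dc=com" -W \
      -b "ou=objects,dc=example,dc=com" \
      "(|(objectId=42)(objectId=97)(objectId=311))" objectId

  # API 3: enumerate everything the user can read (the expensive,
  # once-per-session call).
  ldapsearch -x -D "uid=jsmith,ou=users,dc=example,dc=com" -W \
      -b "ou=objects,dc=example,dc=com" "(objectId=*)" objectId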
We have been considering a URL/URI format for the objects.
Each object will be associated with either a set of users or a set of
groups, and we're leaning towards groups. Initially there may be only
two groups, sensitive and general, but over time a finer-grained
model has some longer-term business appeal.
Rather than create yet another bit of bespoke infrastructure, I was
considering recommending that the API be implemented using LDAP.
The kicker is that there could be up to two or three hundred million
objects, split across 4,000 or so users broken up into 500 groups.
Most of these users currently exist in an Oracle backend for a
bespoke app, and most of the objects will be exported as rows from
the database, so essentially a row is an object.
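To make that concrete, a single exported row might become an LDIF
entry along these lines; the analyticsObject class and the objectId
and securityGroup attributes are purely hypothetical placeholders for
whatever schema is eventually defined:

  dn: objectId=42,ou=objects,dc=example,dc=com
  objectClass: analyticsObject
  objectId: 42
  securityGroup: general
  description: remaining column values carried over from the row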
The shop is a mixed Windows and Unix/Linux shop where the directory
is currently AD. The internal developers are primarily MS and one has
already suggested putting the objects into AD; however, I've said
that this would probably hose operational systems and wouldn't be
such a good idea. ADAM will be the next suggestion from the MS side,
and if that occurs I will be suggesting that OpenLDAP should be
thrown into the mix for a bakeoff on the grounds that the
best-performing system should provide the service. If this comes to
pass I will also be making sure that load times are in scope as well.
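If I understand correctly, bulk loads are normally done offline with
slapadd rather than over the protocol. A rough sketch, assuming the
rows have been exported to objects.ldif, slapd is stopped, and the
paths and suffix are illustrative:

  # -q (quick mode) skips some consistency checks for a much faster
  # initial import; only safe on a fresh, offline database
  slapadd -q -f /etc/openldap/slapd.conf \
      -b "ou=objects,dc=example,dc=com" -l objects.ldif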
Based upon the information above:
1 Is this completely insane?
Whether this is insane you have to decide yourself. There are only a
few LDAP directories that can manage such an amount of entries;
OpenLDAP can handle this, although the maximum number of entries I
have ever managed was in the range of 100 million. You might consider
splitting the tree into two or three partitions.
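A minimal sketch of such a split, in slapd.conf style and with purely
illustrative suffixes; each partition is simply its own database
section:

  database  bdb
  suffix    "ou=objects-a,dc=example,dc=com"
  directory /var/lib/ldap/objects-a

  database  bdb
  suffix    "ou=objects-b,dc=example,dc=com"
  directory /var/lib/ldap/objects-b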
With regard to the so-called APIs, it would probably be better to
define proper search filters and access controls.
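For the group-based model you describe, the access control side might
look something like this in slapd.conf; the group DN and the
securityGroup attribute are assumptions carried over from the sketch
above:

  # members of cn=sensitive may read everything under ou=objects;
  # other authenticated users only see entries not tagged sensitive
  access to dn.subtree="ou=objects,dc=example,dc=com"
         filter="(securityGroup=sensitive)"
      by group.exact="cn=sensitive,ou=groups,dc=example,dc=com" read
      by * none
  access to dn.subtree="ou=objects,dc=example,dc=com"
      by users read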
Depending on the number of concurrent connections, the bottleneck
would probably be the network and connection handling rather than the
database itself.
2 Any thoughts or suggestions as to appropriate schemas for
implementing this, database backends (i.e. BDB, HDB), malloc
replacements, etc.? LDIF scripts for loading objects would be
welcome :)
At this early stage it is almost impossible to discuss schema and the
like.
On the bright side, I do have a 16-core Sun 4600 with 32 GB of RAM at
home for doing initial feasibility work. In production I would expect
the limits on memory to be driven by response times and the amount of
memory that AMD/Intel hardware can support. I would also expect that
commercial support would be needed; however, everything is too
embryonic at present to push that barrow.
If you know the average size of an entry you can calculate the
required disk and RAM space quite easily.
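For example, with the numbers above and an assumed average entry size
of 1 KB (something to measure from a sample export):

  300,000,000 entries x 1 KB/entry = ~300 GB of raw entry data

Indexes and database overhead come on top of that, and response times
will largely depend on how much of it fits in the database cache.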
My background is in the Unix/Linux open-source area; however, I
haven't really played with OpenLDAP enough to be confident of any
concepts that I have the opportunity to push.
Well, then it is time to get acquainted with OpenLDAP :-)
Dieter Klünter | Systemberatung
GPG Key ID:DA147B05