Hi,
I'm trying to set up an OpenLDAP proxy server using slapd-meta. Everything worked at first, but as the backend databases grew, searches with a fixed page size started to return strange results.
Tracing the problem led me to the following. If I set a page size of 300 in my ldapsearch (where localhost:3890 is my slapd running slapd-meta):

    ldapsearch -x -W -D "CN=admin,DC=example,DC=org" -E pr=300 -H "ldap://localhost:3890/" -b "dc=a,dc=example,dc=org" "(objectClass=inetOrgPerson)" uid

I get at most 300 results and the paging information is lost along the way. If I raise the page size from 300 to 1000 I get all of the results, because there are only ~500 entries.
Doing an ldapsearch directly against the backend servers:

    ldapsearch -x -W -D "CN=admin,DC=example,DC=org" -E pr=300 -H "ldap://dc1.a.example.org/" -b "dc=a,dc=example,dc=org" "(objectClass=user)" userPrincipalName

I get 300 results and a prompt to press Enter; after pressing Enter a few times I get all of the entries.
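For reference, this is the loop I understand a paging client to run under the hood (a minimal python-ldap sketch; the bind password is a placeholder, the other names match the commands above):

    import ldap
    from ldap.controls import SimplePagedResultsControl

    # Minimal RFC 2696 paged-results loop; "secret" is a placeholder.
    conn = ldap.initialize("ldap://localhost:3890/")
    conn.simple_bind_s("CN=admin,DC=example,DC=org", "secret")

    page = SimplePagedResultsControl(True, size=300, cookie="")
    while True:
        msgid = conn.search_ext(
            "dc=a,dc=example,dc=org", ldap.SCOPE_SUBTREE,
            "(objectClass=inetOrgPerson)", ["uid"],
            serverctrls=[page])
        rtype, rdata, rmsgid, serverctrls = conn.result3(msgid)
        for dn, entry in rdata:
            print(dn)
        # The server hands back a fresh cookie with every page; an
        # empty cookie means the result set is exhausted.
        pctrls = [c for c in serverctrls
                  if c.controlType == SimplePagedResultsControl.controlType]
        if not pctrls or not pctrls[0].cookie:
            break
        page.cookie = pctrls[0].cookie

Through the proxy this loop stops after the first page because no cookie comes back; against AD directly it runs to completion.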
My backends are Active Directory servers... :(
Thanks for your help, Lajos
Config:

    include         /etc/ldap/schema/core.schema
    include         /etc/ldap/schema/cosine.schema
    include         /etc/ldap/schema/ad_attr.schema
    include         /etc/ldap/schema/ad_class.schema

    pidfile         /var/run/slapd/slapd.pid
    argsfile        /var/run/slapd/slapd.args
    loglevel        8
    modulepath      /usr/lib/ldap
    moduleload      back_meta
    moduleload      back_ldap
    moduleload      rwm
    moduleload      pcache
    moduleload      back_bdb
    sizelimit       1000
    tool-threads    1

    database        meta
    suffix          "dc=example,dc=org"
    norefs          yes
    rebind-as-user  yes
    chase-referrals no

    uri             "ldap://dc1.example.org/dc=example,dc=org" "ldap://dc2.example.org/"
    uri             "ldap://dc1.a.example.org/dc=a,dc=example,dc=org" "ldap://dc2.a.example.org/"
    uri             "ldap://dc1.b.example.org/dc=b,dc=example,dc=org" "ldap://dc2.b.example.org/"

    overlay         rwm
    rwm-rewriteEngine on
    rwm-map         attribute uid userPrincipalName
    rwm-map         objectclass inetOrgPerson user
--On Tuesday, May 05, 2009 1:31 PM +0200 Lajos Boróczki <boroczki.lajos@gmail.com> wrote:
Hi,
I'm trying to set up an OpenLDAP proxy server using slapd-meta. Everything worked at first, but as the backend databases grew, searches with a fixed page size started to return strange results.
What OpenLDAP release are you using?
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Lajos Boróczki wrote:
[Lajos's original message and configuration, quoted in full above, trimmed]
I don't think slapd-meta(5) can handle cross-target paged results. It would require some non-trivial bookkeeping of the control's cookie, which <personal opinion> I wouldn't consider worth the effort for such a useless control. </personal opinion>
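Roughly, for each page the proxy would have to keep one cookie per target and fold them into the single opaque cookie it returns to the client — something like this sketch (illustrative Python only; nothing like it exists in back-meta, and the target names are just labels):

    import base64
    import json

    # One RFC 2696 cookie per target, multiplexed into the single
    # cookie the proxy hands the client, and demultiplexed again on
    # the next request.
    def pack_cookies(per_target):
        # per_target: {"dc1.a.example.org": b"...", "dc1.b.example.org": b"..."}
        blob = {t: base64.b64encode(c).decode("ascii")
                for t, c in per_target.items()}
        return base64.b64encode(json.dumps(blob).encode("ascii"))

    def unpack_cookies(proxy_cookie):
        blob = json.loads(base64.b64decode(proxy_cookie))
        return {t: base64.b64decode(c) for t, c in blob.items()}

And that is only the easy part: the requested page size applies to the merged stream, so the proxy would also have to interleave per-target pages and remember how far each target has advanced.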
p.
Sadly, that was my opinion too. But there are some applications that offer no way to fine-tune LDAP client parameters, so I asked in case there is a secret option... :)
Thanks for your help, Lajos
2009/5/6 Pierangelo Masarati <masarati@aero.polimi.it>:
[earlier messages in the thread, quoted in full above, trimmed]
On Wed, May 6, 2009 at 5:38 PM, Pierangelo Masarati <masarati@aero.polimi.it> wrote:
I don't think slapd-meta(5) can handle cross-target paged results. It would require some non-trivial bookkeeping of the control's cookie, which <personal opinion> I wouldn't consider worth the effort for such a useless control. </personal opinion>
On the surface I would agree, but there are many situations in which it is impractical or impossible to process search results in a single gulp without the paged results control.
One case I can think of:
A web interface where you need to present LDAP search results to a user via a browser, where practicalities mean you cannot display more than, say, a few hundred results at once, whether for usability, speed, or browser page-size limitations (think many thousands or millions of records, shown perhaps 100 at a time).
In this situation you cannot store the complete search result for every single user, whether in memory or in each user's session, for practical, scalability, and efficiency reasons. More to the point, the user may find what they want on the second page, so the rest of the unused search result would sit in memory or in the session until it expires, the user having navigated away from the results page and perhaps on to greener pastures (UI-wise).
Similarly, for very large databases the search results may exceed the physical memory of the server trying to return them, so some part of your data set would always be inaccessible without the paged results control to load it in chunks the LDAP client can deal with.
Paged results may never be needed for small data sets, but get to 100K or 1M entries and you would grind to a halt pretty fast... even 10K gets awkward at 100 entries at a time :)
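(To put numbers on it: at a page size of 100, a 1M-entry result set means 10,000 round trips, but each one stays small and bounded, which is exactly the point.)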
The lack of paged results in slapd-meta probably diminishes its usefulness for large data sets. I would also think that users of slapd-meta are typically at least medium-sized deployments trying to tack together several non-trivially-sized data sets, so there is some probability they will stumble over this eventually.
Yes, I know you added a personal-opinion disclaimer, sorry about that... but "useless" is indeed a matter of opinion; those who need it will REALLY need it. Users of large data sets just {can't, won't} be able to use back-meta :)
Cheers, Brett
Brett @Google wrote:
On Wed, May 6, 2009 at 5:38 PM, Pierangelo Masarati <masarati@aero.polimi.it> wrote:
I don't think slapd-meta(5) can handle cross-target paged results. It would require some non-trivial bookkeeping of the control's cookie, which <personal opinion> I wouldn't consider worth the effort for such a useless control. </personal opinion>
On the surface I would agree, but there are many situations in which it is impractical or impossible to process search results in a single gulp without the paged results control.
One case I can think of:
A web interface where you need to present LDAP search results to a user via a browser, where practicalities mean you cannot display more than, say, a few hundred results at once, whether for usability, speed, or browser page-size limitations (think many thousands or millions of records, shown perhaps
Millions of records with a browser? Are you kidding? If my browser can only inspect 500 entries at a time, I'd just put a client-side sizelimit of 500 on the search request (and the server should do the same on its side).
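(With ldapsearch, that client-side limit is the -z option, e.g. ldapsearch -x -z 500 ..., which requests a size limit of 500 for the operation.)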
100 at a time).
[rest of Brett's message, quoted in full above, trimmed]
Users of back-meta usually need it to retrieve a single entry without having to care about where it is located.
The only way to implement paged results in back-meta would be to intercept any request for it, chain requests to the remote servers without the control, and page the results locally at the proxy.
Note also that clients often end up using paged results without explicitly requesting them, simply because AD returns large data sets with that control in place even when the client did not ask for it, so that case wouldn't fit into this scheme anyway.
In any case, adding paged results the way it's specified would not be too difficult. It would probably be rather inefficient, though, and, IMHO, not worth the effort. It could be done by putting an overlay on top of back-meta that runs the request without the control, caches the results, and responds to the subsequent page requests itself (horrible, I know).
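Schematically, such an overlay would amount to something like this (a Python sketch of the idea only; run_search_without_control stands in for chaining the search to the targets, and none of these names exist in slapd):

    import uuid

    # Proxy-side paging: run the search once without the control,
    # cache the merged result, and slice it per our own cookie.
    cache = {}  # cookie -> (entries, offset of next page)

    def paged_search(run_search_without_control, page_size, cookie=b""):
        if not cookie:
            entries = run_search_without_control()  # full merged result
            cookie = uuid.uuid4().bytes
            cache[cookie] = (entries, 0)
        entries, off = cache[cookie]
        page = entries[off:off + page_size]
        if off + page_size >= len(entries):
            del cache[cookie]
            return page, b""  # empty cookie: result set exhausted
        cache[cookie] = (entries, off + page_size)
        return page, cookie

The inefficiency is plain: the whole result set gets materialized at the proxy on the first page, no matter how few pages the client actually fetches.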
What would likely be impossible is dealing with an unsolicited paged-results response in any way other than bailing out.
p.