back-ldap connection caching

Howard Chu

9 Oct 2007

5:01 a.m.

After running SLAMD against back-ldap I've noticed some problems in the approach - while a single load generator may send multiple requests over a single connection, back-ldap always creates new connections for each incoming Simple Bind, and leaves them available to be shared by other sessions.

Thinking about it, this usage doesn't really make a lot of sense. Any identity that's explicitly binding to back-ldap is necessarily going to be different from any other session's ID. The only sessions that it makes sense to share are those that were implicitly bound because they were authenticated elsewhere, and fell into this backend (via glue, typically) while processing some other request.

So I think this means we should separate out the explicitly bound connections from everything else. They should only live as long as their inbound slapd connection lives, and should only be used by ops from their inbound slapd connection.
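The separation proposed above can be pictured with a small toy model. This is only a sketch in Python, not back-ldap code: the class name, the method names, the string "connections", and the choice to replace the private connection on a rebind are all assumptions made for illustration.

```python
class ProxyConnCache:
    """Toy model (hypothetical): explicitly bound upstream connections are
    private to the inbound connection that created them; implicitly bound
    ones come from a shared pool."""

    def __init__(self):
        self.shared = []    # implicitly bound sessions (e.g. arrived via glue)
        self.private = {}   # inbound connection id -> upstream connection label

    def explicit_bind(self, inbound_id, dn):
        # Assumption: one upstream connection per inbound connection,
        # so a rebind simply replaces the previous entry.
        self.private[inbound_id] = f"upstream[{inbound_id}] bound as {dn}"
        return self.private[inbound_id]

    def get_for_op(self, inbound_id):
        # An operation only ever sees its own inbound connection's private
        # upstream connection; otherwise it falls back to the shared pool.
        if inbound_id in self.private:
            return self.private[inbound_id]
        if not self.shared:
            self.shared.append("shared upstream #0")
        return self.shared[0]

    def inbound_closed(self, inbound_id):
        # The private upstream connection dies with its inbound connection.
        self.private.pop(inbound_id, None)
```

In this model a session that never bound explicitly can never be handed another session's explicitly bound connection, and closing the inbound connection leaves nothing behind in the cache.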

--
Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/

Pierangelo Masarati

9 Oct

5:59 a.m.

> After running SLAMD against back-ldap I've noticed some problems in the approach - while a single load generator may send multiple requests over a single connection, back-ldap always creates new connections for each incoming Simple Bind, and leaves them available to be shared by other sessions.

We cured this by forcing back-ldap to always use idassert: this way, binds are done through a dedicated (serialized) pool of "privileged" connections, and everything else goes into the usual pool of privileged connections.

> Thinking about it, this usage doesn't really make a lot of sense. Any identity that's explicitly binding to back-ldap is necessarily going to be different from any other session's ID. The only sessions that it makes sense to share are those that were implicitly bound because they were authenticated elsewhere, and fell into this backend (via glue, typically) while processing some other request.
>
> So I think this means we should separate out the explicitly bound connections from everything else. They should only live as long as their inbound slapd connection lives, and should only be used by ops from their inbound slapd connection.

I think that's how it is right now: implicit binds go into the lists of privileged connections, while the AVL holds only connections resulting from explicitly bound requests. What's treated separately right now, and needs to be so, is connections for explicit binds: they shouldn't get into the AVL at all until the bind succeeds (see ITS#5154 wrt/ back-meta).

One thing that probably should default to "on" is single-conn: this feature forces back-ldap to uncache connections when rebinding. In fact, the usual behavior only makes sense when a client plans to repeatedly bind on one connection with different identities, and do something with those identities. In this case, if the client at some point needs to re-use an identity that was used earlier, the connection will already be available. With single-conn on, as soon as a client rebinds on an existing connection, the old one is removed.
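The trade-off described above can be made concrete with a toy model. This is a hedged sketch, not back-ldap's implementation: `InboundSession`, `replay`, and the counters are hypothetical names, and the model only tracks cache entries, not real sockets.

```python
class InboundSession:
    """Hypothetical model of the upstream connections cached on behalf of
    one inbound client connection."""

    def __init__(self, single_conn):
        self.single_conn = single_conn
        self.cached = {}   # bound DN -> upstream connection label
        self.created = 0   # how many cache entries were created in total

    def bind(self, dn):
        if self.single_conn:
            # Rebinding uncaches whatever was cached before.
            self.cached.clear()
        if dn not in self.cached:
            self.created += 1
            self.cached[dn] = f"upstream #{self.created} as {dn}"
        return self.cached[dn]


def replay(single_conn, dns):
    """Replay a sequence of binds; return (entries created, entries retained)."""
    session = InboundSession(single_conn)
    for dn in dns:
        session.bind(dn)
    return session.created, len(session.cached)
```

Replaying the bind sequence `cn=a`, `cn=b`, `cn=a` in this model: without single-conn the third bind finds the earlier `cn=a` entry still cached, at the cost of retaining one entry per identity ever used; with single-conn only one entry is retained at any time.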

A totally different approach, though probably not worthwhile unless the number of identities is guaranteed to be small, is to cache connections based on the identity only. In that case, multiple clients binding with the same identity could re-use the same connection. This approach could be implemented by extending the concept of "privileged connection" to a limited set of well-known privileged users.
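A minimal sketch of caching by identity alone, assuming a plain dictionary keyed by the bound DN. The class name and the lower-casing used as "normalization" are illustrative simplifications, not how slapd normalizes DNs.

```python
class IdentityCache:
    """Hypothetical cache keyed only by the bound identity."""

    def __init__(self):
        self.by_dn = {}

    def get(self, dn):
        # Simplified normalization (assumption): trim and lower-case the DN.
        key = dn.strip().lower()
        if key not in self.by_dn:
            self.by_dn[key] = {"dn": key, "users": 0}
        conn = self.by_dn[key]
        conn["users"] += 1   # count how many clients share this connection
        return conn
```

Two clients binding as the same user get the same entry back, which is the benefit; the cache still grows by one entry per distinct identity, which is why the approach only pays off when the set of identities is small.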

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati@sys-net.it
---------------------------------------

Howard Chu

5:38 p.m.

Pierangelo Masarati wrote:

>> After running SLAMD against back-ldap I've noticed some problems in the approach - while a single load generator may send multiple requests over a single connection, back-ldap always creates new connections for each incoming Simple Bind, and leaves them available to be shared by other sessions.
>
> We cured this by forcing back-ldap to always use idassert: this way, binds are done through a dedicated (serialized) pool of "privileged" connections, and everything else goes into the usual pool of privileged connections.

Sure, but we can avoid some of this serialization.

> I think that's how it is right now: implicit binds go into the lists of privileged connections, while the AVL holds only connections resulting from explicitly bound requests. What's treated separately right now, and needs to be so, is connections for explicit binds: they shouldn't get into the AVL at all until the bind succeeds (see ITS#5154 wrt/ back-meta).

Hm, if privileged connections are always pooled separately and never in the AVL, then we can just get rid of the DN comparison portion of the AVL lookups.

Another thing we could do to simplify this management is replace the AVL with an array of pointers, and use the conn_idx to check immediately for an explicitly bound connection. (The conn_idx was added for the benefit of the ppolicy overlay, but we ought to have used it here as well.)
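The array idea can be sketched as follows. This is an illustrative Python model only: `ExplicitConnTable` and its methods are hypothetical names, and it assumes conn_idx is a small integer in the range 0 to max_conns - 1 that identifies the inbound connection.

```python
class ExplicitConnTable:
    """Hypothetical fixed-size table: slot i holds the upstream connection
    owned by the inbound connection whose conn_idx is i, or None."""

    def __init__(self, max_conns):
        self.slots = [None] * max_conns

    def store(self, conn_idx, upstream):
        self.slots[conn_idx] = upstream

    def lookup(self, conn_idx):
        # Direct index: no tree walk and no DN comparison.
        return self.slots[conn_idx]

    def drop(self, conn_idx):
        self.slots[conn_idx] = None
```

Compared with a tree keyed on connection plus DN, the lookup is a single index operation, and an empty slot answers immediately that the inbound connection has no explicitly bound upstream connection.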

> One thing that probably should default to "on" is single-conn: this feature forces back-ldap to uncache connections when rebinding. In fact, the usual behavior only makes sense when a client plans to repeatedly bind on one connection with different identities, and do something with those identities. In this case, if the client at some point needs to re-use an identity that was used earlier, the connection will already be available. With single-conn on, as soon as a client rebinds on an existing connection, the old one is removed.

If the client is issuing Bind requests anyway, there's no need to keep the old identities around. It should just keep re-using the same connection over and over.

> A totally different approach, though probably not worthwhile unless the number of identities is guaranteed to be small, is to cache connections based on the identity only. In that case, multiple clients binding with the same identity could re-use the same connection. This approach could be implemented by extending the concept of "privileged connection" to a limited set of well-known privileged users.

Right. In the current case, where a small number of clients are binding to a large number of different identities, it wouldn't be any benefit.

Pierangelo Masarati

10 Oct

2:49 a.m.

Howard Chu wrote:

> Pierangelo Masarati wrote:
>>> After running SLAMD against back-ldap I've noticed some problems in the approach - while a single load generator may send multiple requests over a single connection, back-ldap always creates new connections for each incoming Simple Bind, and leaves them available to be shared by other sessions.
>>
>> We cured this by forcing back-ldap to always use idassert: this way, binds are done through a dedicated (serialized) pool of "privileged" connections, and everything else goes into the usual pool of privileged connections.
>
> Sure, but we can avoid some of this serialization.

This serialization only occurs if you force all operations to be idassert'ed; it was required by the need to proxy a very large database (>20,000,000 users), otherwise both the proxy and the remote server would have quickly run out of descriptors if connections had to be kept around. This way, we could pool connections dedicated to binds (serialized) and to regular auth'ed operations with idassert (multiplexed) or anonymous (again multiplexed). That was the wisest use of resources we could figure out without rewriting back-ldap from scratch (well, it was heavily modified, indeed).
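The descriptor argument can be illustrated with a toy count. This is a sketch under stated assumptions: the pool sizes (4 and 8), the round-robin selection, and all names are hypothetical, and the serialization of binds on each pooled connection is not modeled.

```python
import itertools


class FixedPool:
    """Hypothetical fixed-size pool handing out connections round-robin."""

    def __init__(self, name, size):
        self.conns = [f"{name}-{i}" for i in range(size)]
        self._next = itertools.cycle(self.conns)

    def pick(self):
        return next(self._next)


def upstream_descriptors(n_users, bind_pool, idassert_pool):
    """Count the distinct upstream connections touched while serving n_users."""
    used = set()
    for _ in range(n_users):
        used.add(bind_pool.pick())       # credential check on the bind pool
        used.add(idassert_pool.pick())   # later ops carried with an asserted identity
    return len(used)
```

However many users pass through, the number of upstream descriptors in this model is bounded by the sum of the pool sizes rather than by the number of identities.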

>> I think that's how it is right now: implicit binds go into the lists of privileged connections, while the AVL holds only connections resulting from explicitly bound requests. What's treated separately right now, and needs to be so, is connections for explicit binds: they shouldn't get into the AVL at all until the bind succeeds (see ITS#5154 wrt/ back-meta).
>
> Hm, if privileged connections are always pooled separately and never in the AVL, then we can just get rid of the DN comparison portion of the AVL lookups.

Right.

> Another thing we could do to simplify this management is replace the AVL with an array of pointers, and use the conn_idx to check immediately for an explicitly bound connection. (The conn_idx was added for the benefit of the ppolicy overlay, but we ought to have used it here as well.)

Right.

>> One thing that probably should default to "on" is single-conn: this feature forces back-ldap to uncache connections when rebinding. In fact, the usual behavior only makes sense when a client plans to repeatedly bind on one connection with different identities, and do something with those identities. In this case, if the client at some point needs to re-use an identity that was used earlier, the connection will already be available. With single-conn on, as soon as a client rebinds on an existing connection, the old one is removed.
>
> If the client is issuing Bind requests anyway, there's no need to keep the old identities around. It should just keep re-using the same connection over and over.

Right. Just make single-conn true by default, or even remove the code that conditions it (the default right now consists in doing nothing).

>> A totally different approach, though probably not worthwhile unless the number of identities is guaranteed to be small, is to cache connections based on the identity only. In that case, multiple clients binding with the same identity could re-use the same connection. This approach could be implemented by extending the concept of "privileged connection" to a limited set of well-known privileged users.
>
> Right. In the current case, where a small number of clients are binding to a large number of different identities, it wouldn't be any benefit.

In that case, I'd still stick with idassert'ing all operations with a generic, non-anonymous identity, performing binds with a dedicated pool of connections, and eventually using pooled connections for well-known, privileged identities.

Ing. Pierangelo Masarati OpenLDAP Core Team

openldap-devel@openldap.org
