Re: (ITS#6276) paused pool can deadlock if writers are waiting

26 Aug 2009


      Hallvard B Furuseth wrote:
...
This is a quick abstract answer, I haven't dived into the code to see
what my suggestions would mean in practice.  Anyway:
...
It sounds to me like this problem cannot be fully prevented when network
operations share the same pool queue as everything else, and we have
operations that can wait for other operations.  All pool threads can be
occupied with LDAP-level operations, so that network-level operations
which the LDAP-level operations depend on can get blocked.  If slapd has
a design which even tries to guarantee forward progress, I'm not aware
of it.
So LDAP-level operations ought to leave at least one thread free for for
network-level operations.  Not necessary a designated thread: A thread
moving from network-level to LDAP-level operation like
connection_read_thread() does could first check that at least one other
thread remains available for network-level operation.
Yes...
...
Beyond that, my immediate reaction was that pauses are implemented at
the wrong level and/or need to be split up in different types of pauses.
There is no good reason cn=config's need to have slapd config variables
for itself to affect network operations - nor, I hope, affect pool-level
operations like pool_purgekey().
I suppose implementing things this way can appear to be too blunt. But the 
amount of locks required to do it at a finer level is, IMO, unmanageable. The 
fact that you can change global slapd config parameters (such as the number of 
threads, size of sockbuf buffers, etc.) makes it inherently safe for 
*anything* else to be active while config changes are being made. And sifting 
thru each variable to decide how sensitive they are, and when they are unsafe, 
will inevitably lead to requiring locks on every single piece of configuration 
data. (Want to look up an attribute type, or objectclass? Or a database 
suffix? etc. etc. etc... There are countless things we do arbitrarily that 
simply won't work if we allow arbitrary threads to continue to run while their 
configurations are changed from under them.)
...
E.g. cn=config could use a slapd_config_pause() call to ask for lone
access to the config variables instead of thread_pool_pause() call.
Slapd operations that must respect such pauses should thus call a
slapd_config_pausecheck() macro/function, not thread_pool_pausecheck()
or depend on the pool to respect slapd-level pauses.
connection_read_thread() can then go ahead and use the connection
independently of those pauses.  If it reaches connection_operation(),
it'll need to check for slapd pause.
Yes, I've thought about this approach too. That makes the pause mechanisms 
even messier.
Another possibility is to just try to read 1 byte in the listener thread, to 
detect the hangup there when we have no other means to discover it. We would 
have to make sure to be able to unget this byte back to the bottom of the 
sockbuf stack if there's valid data. This will affect the throughput of the 
listener thread, but it may not be too terrible a hit.
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

Re: (ITS#6276) paused pool can deadlock if writers are waiting