Revisiting an old thread...
I'm definitely seeing our current listener running out of steam on servers
with more than 12 or so cores, and some work in this direction will definitely
help. First it would be a good idea to classify all of the tasks that the
current listener manages, before deciding how to divide them among multiple
threads.
The listener is responsible for many events right now:
  idle timeout checks
  write timeout checks
  listener socket events
  client socket read events
  client socket write events
Splitting the client socket handling across multiple threads will bring the
greatest improvement in scalability. Just need to check our thinking and make
sure the remaining division of labor still makes sense.
There are two cases that annoy me in our current design - why don't we just
dedicate a thread to each listener socket, and let it block on accept()? That
would eliminate a bit of churn in the current select() workload.
Likewise, why don't we just let writer threads block in write(), instead of
having them ask the listener to listen for writability on their socket? Or, if
we're using non-blocking sockets, why don't we let the writer threads block in
their own select call, instead of relying on the central thread to do the
select and re-dispatch?
The first, obvious answer is this: when threads are blocked in system calls
like accept(), we can't simply wake them up again for shutdown events or other
situations. I believe the obvious fix here is to use select() in each thread,
waiting for both the target fd and the wake_sds fd which is written to
whenever a signal is caught. Off the top of my head I'm not sure, when several
threads are selecting on the same fd, if they all receive a Readable event or
if only one of them will. Anyone know?
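On the question of several threads selecting on the same fd: as far as I
understand POSIX select(), it reports level-triggered readiness rather than
handing an event to a single waiter, so every thread should see the fd as
readable for as long as the data is left unread. A minimal test sketch,
assuming pthreads and a plain pipe standing in for wake_sds (the names
wake_fds, waiter and count_woken_by_one_write are made up for illustration
and are not from the slapd source):

```c
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

#define NTHREADS 4

static int wake_fds[2];     /* hypothetical stand-in for wake_sds */
static int woke[NTHREADS];  /* slot i is set once thread i sees the fd readable */

static void *waiter(void *arg)
{
    int id = *(int *)arg;
    fd_set rfds;

    FD_ZERO(&rfds);
    FD_SET(wake_fds[0], &rfds);
    /* Block until the wake fd is readable. The byte is deliberately not
     * read, so the fd stays readable for the other threads as well. */
    if (select(wake_fds[0] + 1, &rfds, NULL, NULL, NULL) > 0 &&
        FD_ISSET(wake_fds[0], &rfds))
        woke[id] = 1;
    return NULL;
}

/* Start NTHREADS waiters, write one byte, and return how many of them
 * reported the fd as readable (-1 on setup failure). */
int count_woken_by_one_write(void)
{
    pthread_t tids[NTHREADS];
    int ids[NTHREADS];
    int i, n = 0;

    if (pipe(wake_fds) != 0)
        return -1;
    for (i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        woke[i] = 0;
        if (pthread_create(&tids[i], NULL, waiter, &ids[i]) != 0)
            return -1;
    }
    if (write(wake_fds[1], "x", 1) != 1)
        return -1;
    for (i = 0; i < NTHREADS; i++) {
        pthread_join(tids[i], NULL);
        n += woke[i];
    }
    close(wake_fds[0]);
    close(wake_fds[1]);
    return n;
}
```

If that holds on the platforms we care about, a shutdown byte that is written
once and never drained would reach every selecting thread. The flip side is
that threads sharing one listener fd would presumably all wake for a single
incoming connection.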
I don't think the remaining tasks really involve much overhead. So ideally, we
can handle the idle/write timeout and runqueue scheduling in a single thread,
which can also be responsible for the main signal/shutdown processing.
The listener sockets can each have their own dedicated thread, selecting on
their listener socket and the signal fd.
A small number of threads can handle the bulk of the client socket events. We
would continue to use a power-of-two descriptor table, and a power-of-two set
of threads here. The division of labor would simply be
    thread# = FD % number of threads
(There's no need to split the current connection table into multiple arrays,
we just divvy things up so that a given thread only accesses its own slots in
the table.)
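As a sketch of the mapping, assuming the thread count really is kept at a
power of two, the modulo reduces to a bit mask. fd_to_thread is a hypothetical
helper name, not an existing function:

```c
#include <assert.h>

/* Map a descriptor to the index of the thread that owns it.
 * Assumes nthreads is a nonzero power of two, so that
 * fd % nthreads == fd & (nthreads - 1). */
static inline unsigned fd_to_thread(int fd, unsigned nthreads)
{
    assert(fd >= 0);
    assert(nthreads != 0 && (nthreads & (nthreads - 1)) == 0);
    return (unsigned)fd & (nthreads - 1);
}
```

With four threads, descriptors 8, 9, 10 and 11 map to threads 0, 1, 2 and 3,
so consecutive fds spread evenly and each thread only ever indexes the table
slots whose fd maps back to it.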
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/