Currently we have a single select (or epoll) loop in daemon.c that listens for all readable and writable sockets, then passes events off to the thread pool for processing.
We listen for a socket to become writable only when a write attempt on it returns incomplete. There's a pair of mutexes and condition variables used to synchronize the writing threads with the listener thread, which adds quite a lot of lock overhead. As far as I can tell, the main reason we do this is so that we can stop a writer thread on demand instead of having it just block forever in write().
We could make the listener's job a lot easier if we had it listen only for readable sockets and made each writer thread do its own poll. Each writer would need to poll on two descriptors - the one it's waiting to write on, and the read end of a pipe used by the listener to terminate the poll. That pipe could be signalled by e.g. the listener writing a byte to it; all writer threads could poll it for readability. (One question here - if multiple threads are polling the same descriptor, do they all receive the wakeup event?)
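
To make the proposal concrete, here is a rough sketch of what the per-writer wait might look like. The name wait_writable and its return convention are made up for illustration; nothing like it exists in daemon.c today. It assumes plain POSIX poll(). On the question above: poll() reports readiness level-triggered, so as long as nobody reads the wakeup byte out of the pipe, every thread polling the read end should see it as readable. That is why the sketch deliberately leaves the byte unread.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not existing daemon.c code.
 * Block until `fd` can be written or the listener signals via `wake_fd`
 * (the read end of the wakeup pipe).
 * Returns 1 if fd is ready for a write attempt, 0 if the listener woke us,
 * -1 on a poll() error. */
static int wait_writable(int fd, int wake_fd)
{
    struct pollfd pfd[2];

    assert(fd >= 0 && wake_fd >= 0);

    pfd[0].fd = fd;
    pfd[0].events = POLLOUT;
    pfd[0].revents = 0;
    pfd[1].fd = wake_fd;
    pfd[1].events = POLLIN;
    pfd[1].revents = 0;

    for (;;) {
        int rc = poll(pfd, 2, -1);      /* no timeout: wait indefinitely */
        if (rc < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal, retry */
            return -1;
        }
        /* Check the wakeup pipe first so a stop request wins over a write.
         * The byte is intentionally NOT read here, so other writer threads
         * polling the same pipe still see it as readable. */
        if (pfd[1].revents & (POLLIN | POLLHUP))
            return 0;
        /* On POLLERR/POLLHUP let the caller's write() report the error. */
        if (pfd[0].revents & (POLLOUT | POLLERR | POLLHUP))
            return 1;
    }
}
```

A writer thread would call this after an incomplete write and retry the write when it returns 1. If the pipe is only ever used for shutdown, the listener can write one byte and never drain it; if it is meant to be reusable, the listener would have to drain it only after all writers have noticed, which brings back some synchronization.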