On Tue, Jan 22, 2019 at 05:49:12PM +0000, h.b.furuseth@usit.uio.no wrote:
On 1/21/19 7:49 PM, Ondřej Kuzník wrote:
Except there are no locks, as you know, being the author of parts of the code that deals with what I'm about to outline anyway:
Whenever a cn=config op is about to be processed, a pause is requested. That tells worker threads to stop picking up new work and waits until they're all quiet. The indexing task is run by one of these worker threads.
Duuh, right. I got stuck looking for what's special about the indexing task and couldn't find it :-( I need to make it special.
So, let tasks declare their expected speed, i.e. how soon they expect to finish or to reach their next pause check. Say FAST=1 (default) and SLOW=0. A pause only stops tasks with speed < ltp_pause. In thread_pool_pause(), replace the WANT_PAUSE stage with
while (++ltp_pause <= max speed) { wait until no more tasks with speed < ltp_pause; }
Then fast tasks should breeze past slow ones when preparing to pause, at least until all threads are running slow tasks.
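To make the staged pause concrete, here is a minimal single-threaded sketch in C. It is a toy simulation, not tpool code: struct sim_task, wait_for_level() and sim_pause() are hypothetical names, "ticks" stand in for real time, and the final wait after the loop is my assumption about what the stage following WANT_PAUSE would do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Speed levels as in the proposal; everything else here is hypothetical. */
enum { SPEED_SLOW = 0, SPEED_FAST = 1, SPEED_MAX = SPEED_FAST };

struct sim_task {
    int  speed;       /* declared speed: SPEED_SLOW or SPEED_FAST */
    int  period;      /* ticks between two pause checks, must be >= 1 */
    int  ticks_left;  /* ticks until the next pause check */
    bool parked;      /* true once the task has stopped for the pause */
    int  units_done;  /* pause checks passed without parking */
};

/* Run simulated ticks until no unparked task has speed < level. */
static void wait_for_level(struct sim_task *t, size_t n, int level)
{
    for (;;) {
        bool waiting = false;
        for (size_t i = 0; i < n; i++)
            if (!t[i].parked && t[i].speed < level)
                waiting = true;
        if (!waiting)
            return;

        for (size_t i = 0; i < n; i++) {
            if (t[i].parked || --t[i].ticks_left > 0)
                continue;
            /* The task has reached a pause check. */
            if (t[i].speed < level) {
                t[i].parked = true;             /* slow enough: stop here */
            } else {
                t[i].units_done++;              /* not affected yet: keep working */
                t[i].ticks_left = t[i].period;
            }
        }
    }
}

/* Staged pause: stop the slowest tasks first, faster ones afterwards. */
static void sim_pause(struct sim_task *t, size_t n)
{
    int ltp_pause = 0;

    while (++ltp_pause <= SPEED_MAX)
        wait_for_level(t, n, ltp_pause);

    /* ltp_pause is now SPEED_MAX + 1, so every task parks at its next check. */
    wait_for_level(t, n, ltp_pause);
}
```

With one slow task that needs 5 ticks to reach its pause check and one fast task that checks every tick, the fast task passes 5 more checks while the slow one is being waited for, then parks in the final stage.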
To mitigate that, we'd need to predeclare the speed when submitting a task, and limit the number of parallel slow tasks. pool_submit() could stash the rest in a "slow queue" instead of submitting. But I don't want to go there yet.
The problem is not that it's still scheduled when the pause is requested: scheduled tasks don't prevent a pause (and I think some of them need to be a bit more aware of that fact). The problem is that it's been picked up by a worker thread and is running while cn=config waits for the workers to pause.
What I was suggesting was that it could check every so often whether it might be holding up a pause and join it just like some other tasks do. But that's only a good idea if it's able to handle the changes that happen during the pause (indexes reconfigured again, the database going away, ...) as well as being able to release any locks temporarily.
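A rough sketch in C of what such a cooperative check could look like, under heavy assumptions: struct fake_pool and fake_pausecheck() are hypothetical stand-ins for the real pool and its pause check, the "configuration generation" counter is an invented way to notice that something changed during the pause, and lock handling is only indicated by a comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state a pause can change under the task. */
struct fake_pool {
    bool pause_pending;       /* a cn=config op has requested a pause */
    bool pause_reconfigures;  /* that op will change the index configuration */
    int  config_gen;          /* bumped whenever the configuration changes */
    int  pauses_joined;       /* how many pauses the task has joined */
};

/* Stand-in pause check: joins a pending pause, returns true if it paused. */
static bool fake_pausecheck(struct fake_pool *p)
{
    if (!p->pause_pending)
        return false;
    /* A real task would release its locks here and block until resume. */
    p->pause_pending = false;
    p->pauses_joined++;
    if (p->pause_reconfigures)
        p->config_gen++;
    return true;
}

enum index_result { INDEX_DONE, INDEX_ABORTED };

/* Index `total` entries, checking for a pause every `check_every` (> 0) entries. */
static enum index_result
index_entries(struct fake_pool *p, size_t total, size_t check_every, size_t *indexed)
{
    int gen = p->config_gen;  /* configuration the task started with */

    *indexed = 0;
    while (*indexed < total) {
        (*indexed)++;         /* stand-in for indexing one entry */
        if (*indexed % check_every != 0)
            continue;
        /* After a pause, give up if the configuration changed meanwhile. */
        if (fake_pausecheck(p) && p->config_gen != gen)
            return INDEX_ABORTED;
    }
    return INDEX_DONE;
}
```

The point of the generation check is that joining the pause is the easy half; the task also has to notice afterwards that its assumptions no longer hold and bail out (or restart) instead of carrying on with stale state.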