https://bugs.openldap.org/show_bug.cgi?id=9983
Issue ID: 9983
Summary: operation_unlink files the operation before it is fully unlinked
Product: OpenLDAP
Version: 2.5.13
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: lloadd
Assignee: bugs@openldap.org
Reporter: ondra@mistotebe.net
Target Milestone: ---
In epoch-based memory management, an object should only be submitted for reclamation once no actor can reach it anymore, except for actors that already obtained their reference before the submission happened.
This is broken in operation_unlink(): it calls try_release_ref() at the very beginning, which adds the operation to the to-reclaim list, and only then proceeds to unlink it from other objects.
The following sequence is then possible:
- current_epoch == 1 (no threads are alive in epoch == 0)
- in thread 1 (epoch == 1), try_release_ref() marks the object to be reclaimed in current_epoch
- thread 2 activates and current_epoch is incremented (current_epoch == 2)
- thread 2 handles an Unbind for the operation's client and reaches client_reset() (epoch == 2)
- thread 2 (client_reset) snapshots and clears client->c_ops (among other things, c->c_ops links to our object)
- thread 1 finishes operation_unlink, deactivates and there are no more threads in epoch == 1
- thread 3 activates and current_epoch is incremented (current_epoch == 3); there are objects in epoch == 1, namely the object above, which is now destroyed (freed)
- thread 2 wakes up again and tries to call operation_abandon on the above object, which accesses freed memory
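The sequence above can be replayed in a small single-threaded toy model. This is only a sketch: the toy_* names, the per-epoch counters and the reclamation rule ("an object filed in epoch E may be freed once no thread is active in E or earlier") are assumptions made for illustration and are not the lloadd epoch implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_EPOCH 8

/* Hypothetical model of the epoch bookkeeping, not the lloadd structures. */
struct toy_epochs {
    unsigned current;               /* stands in for current_epoch */
    unsigned active[TOY_MAX_EPOCH]; /* number of threads active in each epoch */
};

/* A thread activates: in this toy model that always advances the epoch. */
static unsigned toy_join(struct toy_epochs *e)
{
    assert(e->current + 1 < TOY_MAX_EPOCH);
    e->current++;
    e->active[e->current]++;
    return e->current;
}

/* A thread deactivates and leaves the epoch it joined in. */
static void toy_leave(struct toy_epochs *e, unsigned epoch)
{
    assert(e->active[epoch] > 0);
    e->active[epoch]--;
}

/* Assumed rule: an object filed in epoch `filed` may be freed once no
 * thread is active in that epoch or any earlier one. */
static bool toy_can_free(const struct toy_epochs *e, unsigned filed)
{
    for (unsigned i = 0; i <= filed; i++)
        if (e->active[i] > 0)
            return false;
    return true;
}

/* Replays the reported sequence. Returns true if the object becomes
 * freeable while thread 2 is still active and still holds a pointer. */
static bool toy_replay(void)
{
    /* current_epoch == 1, thread 1 is active in it, epoch 0 is empty */
    struct toy_epochs e = { .current = 1, .active = { [1] = 1 } };

    unsigned filed = e.current;  /* thread 1 files the op (epoch 1) */
    unsigned t2 = toy_join(&e);  /* thread 2 activates, epoch becomes 2 */
    bool t2_holds_op = true;     /* thread 2 snapshots c_ops, op still linked */
    toy_leave(&e, 1);            /* thread 1 finishes and deactivates */
    (void)toy_join(&e);          /* thread 3 activates, epoch becomes 3 */

    return t2_holds_op && e.active[t2] > 0 && toy_can_free(&e, filed);
}
```

In this model toy_replay() returns true: thread 2 joined in epoch 2, so it does not hold back reclamation of anything filed in epoch 1, even though it picked up its pointer after the object was filed.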
epoch_append (and try_release_ref) should only be called when unlinking has finished. I'm testing a patch right now.
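The difference between the two orderings can be sketched as follows. The toy_* types and functions are hypothetical stand-ins and do not mirror the real operation_unlink(), try_release_ref() or epoch_append() signatures; the only point is whether the object is still reachable through its client at the moment it is submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the lloadd operation/client types. */
struct toy_op {
    bool submitted;          /* already handed to the reclaimer */
};

struct toy_client {
    struct toy_op *ops[4];   /* plays the role of the client's list of ops */
    size_t n_ops;
};

/* Can another actor still find `op` by walking the client's list? */
static bool toy_reachable(const struct toy_client *c, const struct toy_op *op)
{
    for (size_t i = 0; i < c->n_ops; i++)
        if (c->ops[i] == op)
            return true;
    return false;
}

/* Remove `op` from the client's list (order is not preserved). */
static void toy_unlink_from_client(struct toy_client *c, const struct toy_op *op)
{
    for (size_t i = 0; i < c->n_ops; i++) {
        if (c->ops[i] == op) {
            c->n_ops--;
            c->ops[i] = c->ops[c->n_ops];
            return;
        }
    }
}

/* Stand-in for filing the object for reclamation; must happen once. */
static void toy_submit(struct toy_op *op)
{
    assert(!op->submitted);
    op->submitted = true;
}

/* Ordering described in this report: submit first, unlink afterwards.
 * Returns whether the op was still reachable when it was submitted. */
static bool toy_unlink_submit_first(struct toy_client *c, struct toy_op *op)
{
    toy_submit(op);
    bool reachable_at_submit = toy_reachable(c, op);
    toy_unlink_from_client(c, op);
    return reachable_at_submit;
}

/* Proposed ordering: finish unlinking, only then submit. */
static bool toy_unlink_submit_last(struct toy_client *c, struct toy_op *op)
{
    toy_unlink_from_client(c, op);
    bool reachable_at_submit = toy_reachable(c, op);
    toy_submit(op);
    return reachable_at_submit;
}
```

With the first ordering the object is still reachable when it is submitted, which is the window a later thread can use to pick up a pointer; with the second ordering it is not.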