On Fri, Dec 30, 2016 at 02:41:06PM -0800, Quanah Gibson-Mount wrote:
> Well, it seems to be some sort of race condition.
Yes, I'd agree; it's probably also load-dependent, as I never triggered it on my dev systems, which are mostly idle other than my test load. It only showed up on my prod systems, which tend to have continuous load from various other things.
> I did want to confirm that you see this on servers that are long running (i.e., they've been running for a long time, and had other group deletes that went through w/o issue during that time). If so, then I can modify the test to randomly add and delete groups as a part of the test, increasing the likelihood of triggering the issue within the test.
I don't have too many deletions of group objects themselves in production, mostly just deletions of the members of groups. I didn't see any issues with group deletions in dev, or during some basic initial testing in prod. I'll go ahead and make a new test group, add some members to it, and then delete it and see what happens now that I've been running this code for about 3 weeks...
I didn't see any errors deleting a group, although there were these syncrepl messages that I don't believe used to show up:
Dec 30 21:23:29 themis slapd[2607]: syncrepl_message_to_op: rid=001 be_delete uid=ldaptest5,ou=group,dc=cpp,dc=edu (32)
Dec 30 21:23:29 themis slapd[2607]: syncrepl_message_to_op: rid=003 be_delete uid=ldaptest5,ou=group,dc=cpp,dc=edu (32)
Dec 30 21:23:29 themis slapd[2607]: syncrepl_message_to_op: rid=002 be_delete uid=ldaptest5,ou=group,dc=cpp,dc=edu (32)
The group and memberOf attributes are gone on all four servers, so other than noise in the logs, I'm not sure what these messages mean.
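
For reference, 32 is the LDAP result code noSuchObject (RFC 4511). One possible reading, and it is only a guess, is that each consumer tried to apply a delete for an entry that was already gone locally. Below is a minimal sketch for tallying such lines from a log, assuming the syncrepl_message_to_op format shown above; the regex, the SAMPLE text, and the summarize() helper are illustrative names, not part of OpenLDAP.

```python
import re
from collections import defaultdict

# Result-code names from RFC 4511; only the ones relevant here.
LDAP_RESULT_NAMES = {0: "success", 32: "noSuchObject"}

# Assumed message layout: "syncrepl_message_to_op: rid=NNN <op> <dn> (<code>)"
LINE_RE = re.compile(
    r"syncrepl_message_to_op: rid=(?P<rid>\d+) "
    r"(?P<op>be_\w+) (?P<dn>.+?) \((?P<code>\d+)\)"
)

# The three messages quoted above, one per line.
SAMPLE = """\
Dec 30 21:23:29 themis slapd[2607]: syncrepl_message_to_op: rid=001 be_delete uid=ldaptest5,ou=group,dc=cpp,dc=edu (32)
Dec 30 21:23:29 themis slapd[2607]: syncrepl_message_to_op: rid=003 be_delete uid=ldaptest5,ou=group,dc=cpp,dc=edu (32)
Dec 30 21:23:29 themis slapd[2607]: syncrepl_message_to_op: rid=002 be_delete uid=ldaptest5,ou=group,dc=cpp,dc=edu (32)
"""


def summarize(log_text):
    """Map (op, dn, code, code_name) to the list of rids that logged it."""
    summary = defaultdict(list)
    for match in LINE_RE.finditer(log_text):
        code = int(match.group("code"))
        name = LDAP_RESULT_NAMES.get(code, "unknown")
        key = (match.group("op"), match.group("dn"), code, name)
        summary[key].append(match.group("rid"))
    return dict(summary)


if __name__ == "__main__":
    for (op, dn, code, name), rids in summarize(SAMPLE).items():
        print(f"{op} {dn}: {code} ({name}) from rid(s) {', '.join(rids)}")
```

Grouping by DN and result code makes it easy to see whether the same delete was reported once per rid, which is what the three lines above look like.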