So slapd crashed two days in a row, yesterday morning and this morning
:(. Same backtrace. What's worse, whereas today and every time before
it was relatively harmless, something *weird* happened yesterday.
I did my usual: restart slapd on the master, let the load balancer fail
back over to it, and let it catch up on replication of the changes made
on the secondary. That has always been fine before, other than maybe
having to fix up whatever change was in the middle of being committed
when the primary actually died.
But somehow yesterday, things that happened about a week and a half ago
just mysteriously disappeared 8-/. 10 groups that were created on 3/22
were just *gone* without a trace. I had to go dig up ldif backups from
then and manually re-add them to the directory to clean stuff up. It
looks like there's some group membership corruption too. I'm going to
look at it in more detail tomorrow; it should be identical to our AD
environment, so I can compare and clean up against that. It doesn't look
like any users fell into limbo: I diff'd the user DNs from the backup
taken the day before against the one taken after, and they're all still
there modulo 5 that were deleted intentionally.
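
A DN comparison like that can be sketched roughly as follows. This is a
minimal Python sketch, not the actual commands used: the function names
(read_dns, diff_dns) are made up for illustration, it assumes the backups
are ordinary LDIF text, and it lowercases DNs as a crude stand-in for
proper DN normalization.

```python
import base64


def read_dns(ldif_text):
    """Return the set of entry DNs found in a chunk of LDIF text."""
    # Unfold continuation lines: in LDIF, a line that starts with a
    # single space continues the previous line.
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    dns = set()
    for line in lines:
        lowered = line.lower()
        if lowered.startswith("dn::"):
            # "dn::" means the value is base64-encoded.
            value = base64.b64decode(line[4:].strip()).decode("utf-8")
        elif lowered.startswith("dn:"):
            value = line[3:].strip()
        else:
            continue
        # Assumption: lowercasing is close enough for comparison here.
        dns.add(value.lower())
    return dns


def diff_dns(before_text, after_text):
    """Return (missing, added): DNs only in 'before', DNs only in 'after'."""
    before = read_dns(before_text)
    after = read_dns(after_text)
    return sorted(before - after), sorted(after - before)
```

To use it on real backups, read each LDIF file into a string (for
example with open(path, encoding="utf-8").read()) and pass the two
strings to diff_dns; anything in the "missing" list is an entry that was
in the earlier backup but not the later one.
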
I haven't been overly concerned about this issue; while it's been
annoying, it hasn't had much of a production impact. But it just turned
ugly :(, assuming this was caused by the crash, and I can't see how it
wouldn't have been...
<sigh>. Is there any logging or something I could turn on that would
help for the next occurrence but wouldn't have an excessive impact in
terms of load or disk utilization on a production box?
Thanks...