OpenLDAP ITS:
OK, the issue is looking more and more like buggy slapd behavior.
I have now narrowed it down to an instance where slapd is
partially "hung" and will neither stop nor restart. Replication
breaks at the same time, because slapd itself is breaking:
Oct 11 18:22:08 server1 slapd[10771]: <= bdb_equality_candidates: (uidNumber) not indexed
Oct 11 18:22:08 server1 slapd[10771]: <= bdb_equality_candidates: (gidNumber) not indexed
Oct 11 18:22:08 server1 slapd[10771]: <= bdb_equality_candidates: (uidNumber) not indexed
Oct 11 18:46:41 server1 slapd[10771]: <= bdb_equality_candidates: (uidNumber) not indexed
Oct 11 18:48:37 server1 slapd[10771]: <= bdb_equality_candidates: (uidNumber) not indexed
Oct 11 18:49:05 server1 slapd[10771]: daemon: shutdown requested and initiated.
Oct 11 18:49:05 server1 slapd[10771]: slapd shutdown: waiting for 0 operations/tasks to finish
As you can see, the logs show nothing interesting around the time
this occurs. There is only the repeated string of index warnings
(not an issue, I just haven't indexed these attributes yet),
followed by my attempt to restart slapd after I received a
notification of a replication discrepancy.
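For reference, the index warnings could be addressed with something
like the following. This is only a sketch: it assumes a
slapd.conf-style configuration (not cn=config), and the eq index type
is chosen because the warnings come from equality lookups.

```
# Assumed to sit inside the existing "database bdb" section of slapd.conf
index uidNumber,gidNumber eq
```

Existing entries would then need to be reindexed by running slapindex
while slapd is stopped.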
I have grepped through all of my logs (dmesg, debug, syslog) for
anything related to slapd. What you see above is the most interesting
of the hits returned.
PLEASE help -- the issue is getting more serious now and, judging by
the evidence I've presented, it looks more and more out of my control.
You've seen my config -- can anyone think of why this would happen? It
seems vaguely like a locking issue...
Thanks
Jeff