So... I'm having a problem with persistent corruption in Apple's Open Directory. I believe this corruption is related to OpenLDAP and its Berkeley DB backend. I was hoping that folks here might be able to help me track down whether this is the problem or not.
Essentially, user accounts "disappear" from Workgroup Manager and dscl[1]. Accounts that have maintained a persistent connection continue to be authenticated, but accounts that do not already have a connection are unable to authenticate. The Directory Administrator account, for example, cannot authenticate at these times. If I restart slapd, the missing accounts that had persistent connections can no longer authenticate either.
An LDIF export, however, will show that the accounts are all still there.
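A minimal sketch of how that comparison could be made repeatable, assuming the export is a standard LDIF file and that the short names the directory tools still show have been collected into a list by some other means. The function names and the sample data are hypothetical, and the attribute name `uid` is assumed to hold the account short name.

```python
import base64

def unfold(ldif_text):
    """Join LDIF continuation lines (a line starting with one space
    continues the previous line)."""
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines

def uids_in_ldif(ldif_text):
    """Collect every uid value in the export, decoding base64 values."""
    uids = set()
    for line in unfold(ldif_text):
        lowered = line.lower()
        if lowered.startswith("uid:: "):
            uids.add(base64.b64decode(line[6:]).decode("utf-8", "replace"))
        elif lowered.startswith("uid: "):
            uids.add(line[5:].strip())
    return uids

def missing_accounts(ldif_text, visible_names):
    """Accounts present in the export but absent from the visible list."""
    return sorted(uids_in_ldif(ldif_text) - set(visible_names))

# Hypothetical sample: two accounts exported, only one still visible.
sample = (
    "dn: uid=alice,cn=users,dc=example,dc=com\n"
    "uid: alice\n"
    "\n"
    "dn: uid=bob,cn=users,dc=example,dc=com\n"
    "uid:: Ym9i\n"
)
print(missing_accounts(sample, ["alice"]))
```

Running this after each incident would at least show whether the same accounts vanish every time or a different set does.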
Neither a regular repair nor a catastrophic repair of the Berkeley DB works.[2] The first time this happened, a repair DID work, but subsequent events have not been so easily fixed.
A restore from backup is the only way to fix it. However, I suspect that there is malformed data lurking somewhere in the OpenLDAP system. The backups all have this malformed data. Thus, it doesn't take very much for the system to get corrupted again. A hard shutdown does it every time, and a minor upgrade to the OS did it, too.
The standard suggested fix is to destroy and rebuild the Open Directory setup. For obvious reasons, I would like to avoid this. I want to know *what* is happening.
If it is, in fact, malformed data that is becoming corrupt, *what* data should I be examining, *where* is it located, and *how* do I check it for anomalies?
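As a starting point only, here is a rough sketch of a structural sanity check over an LDIF export. It assumes that duplicate DNs, entries that do not begin with a `dn` line, and entries with no `objectClass` are the kinds of anomalies worth flagging; that is a guess at what "malformed" might look like, not a known cause of this corruption. All names and the sample data are hypothetical.

```python
import base64

def parse_entries(ldif_text):
    """Split LDIF text into entries, each a list of (attribute, value)."""
    # First join folded lines: a leading space continues the previous line.
    logical = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and logical:
            logical[-1] += raw[1:]
        else:
            logical.append(raw)

    entries, current = [], []
    for line in logical:
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            if current:  # a blank line ends the current entry
                entries.append(current)
                current = []
            continue
        attr, _, value = line.partition(":")
        attr = attr.strip().lower()
        if attr == "version" and not current:
            continue  # optional "version: 1" header
        if value.startswith(":"):
            # "attr:: ..." marks a base64-encoded value
            value = base64.b64decode(value[1:].strip()).decode("utf-8", "replace")
        else:
            value = value.strip()
        current.append((attr, value))
    if current:
        entries.append(current)
    return entries

def find_anomalies(entries):
    """Return human-readable descriptions of suspicious entries."""
    problems, seen = [], set()
    for number, entry in enumerate(entries, start=1):
        attrs = [attr for attr, _ in entry]
        if attrs[0] != "dn":
            problems.append(f"entry {number}: does not start with a dn line")
            continue
        dn = entry[0][1].lower()
        if dn in seen:
            problems.append(f"entry {number}: duplicate dn {dn}")
        seen.add(dn)
        if "objectclass" not in attrs:
            problems.append(f"entry {number}: no objectClass attribute")
    return problems

# Hypothetical sample: the second entry repeats a DN and lacks objectClass.
sample = (
    "dn: uid=alice,dc=example,dc=com\n"
    "objectClass: inetOrgPerson\n"
    "uid: alice\n"
    "\n"
    "dn: uid=Alice,dc=example,dc=com\n"
    "uid: alice\n"
)
for problem in find_anomalies(parse_entries(sample)):
    print(problem)
```

A clean result here would not rule out corruption in the underlying database files; it would only suggest that the exported data itself is structurally sound.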
Has anyone else had this kind of persistent corruption of their LDAP system? What was causing it? How did you find it?
Any leads or words of wisdom would be greatly appreciated.
Gilbert Wilson
[1] http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/dscl...
[2] http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/db_r...