daniel@ncsu.edu wrote:
dn: uid=USERNAME,ou=students,ou=people,dc=ncsu,dc=edu
Ya know what! Now that you said that, I actually -do- have a copy of the data in LDIF dump format (pulled via slapcat). Is that what would be useful to see?
Yes, the LDIF for the uid=USERNAME entry that first showed the problem would be a good start.
I XXX'd out everything that indicated who it was. ;)
dn: uid=XXX,ou=students,ou=people,dc=ncsu,dc=edu
objectClass: person
objectClass: inetOrgPerson
objectClass: ncsuPerson
uid: XXX
cn: XXX
sn: XXX
title: Senior
ncsuTwoPartName: XXX
organizationalStatus: registered
o: NC State University
gn: XXX
initials: XXX
displayName: XXX
ncsuAltDisplayName: XXX
ncsuCampusID: XXX
ncsuClassCode: SR
ou: Physics
ncsuCurriculumCode: PY
ou: B S - Philosophy
ncsuCurriculumCode: LSL
mail: XXX@unity.ncsu.edu
ncsuPrimaryEMail: XXX@unity.ncsu.edu
registeredAddress: XXX
postalAddress: XXX
telephoneNumber: XXX
l: Raleigh
st: NC
postalCode: 27603
ncsuPrimaryRole: staff
happen. What commands did you use to initially populate the database? If you
I used slapadd to populate the database initially.
No other mods were done to the entry in the meantime?
Nope, not a one. =(
repeat that (on a separate, fresh database) does the same problem recur?
I'll test that theory if I can find some resources to try it with. (kinda 'strapped' at the moment so to speak)
OK.
Little background: our LDAP service is 100% pulled from other sources, so wiping it and rebuilding it from scratch is always possible. In other words, no end users make changes directly. The only thing that makes any modifications whatsoever is the update script we run on the master LDAP server, which generates LDIF files of what -should- be in the database. Then, via some scripts that come with ... Net::LDAP? some perl module, not remembering off the top of my head ... ldifsort, ldifdiff, etc ... I compare the last build to today's build and push the modifications via ldapmodify.
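For anyone following along, the compare-and-push step can be sketched roughly like this. This is only an illustration in Python, not the actual scripts (those are the Perl ldifsort/ldifdiff tools mentioned above). The function names parse_ldif, diff_snapshots and to_change_ldif are made up for the example, and the parser is deliberately simplified: no base64 values, and attribute names and DNs are compared as plain strings.

```python
from collections import defaultdict


def parse_ldif(text):
    """Parse simple LDIF text into {dn: {attr: set(values)}}.

    Simplifications (assumptions for this sketch): base64 ("attr:: ...")
    and URL values are not handled, names are compared case-sensitively.
    """
    entries = {}
    # Unfold continuation lines: a line starting with one space continues
    # the previous line.
    unfolded = text.replace("\n ", "")
    for block in unfolded.split("\n\n"):
        dn = None
        attrs = defaultdict(set)
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            name, _, value = line.partition(":")
            value = value.strip()
            if name.lower() == "dn":
                dn = value
            else:
                attrs[name].add(value)
        if dn is not None:
            entries[dn] = dict(attrs)
    return entries


def diff_snapshots(old, new):
    """Return (adds, deletes, modifies) needed to turn `old` into `new`."""
    adds = {dn: new[dn] for dn in new if dn not in old}
    deletes = [dn for dn in old if dn not in new]
    modifies = {}
    for dn in new:
        if dn in old and old[dn] != new[dn]:
            changed = {}
            for attr in set(old[dn]) | set(new[dn]):
                if old[dn].get(attr, set()) != new[dn].get(attr, set()):
                    # An empty set means the attribute goes away entirely.
                    changed[attr] = new[dn].get(attr, set())
            modifies[dn] = changed
    return adds, deletes, modifies


def to_change_ldif(adds, deletes, modifies):
    """Render the diff as LDIF change records of the kind ldapmodify reads."""
    records = []
    for dn in sorted(adds):
        lines = [f"dn: {dn}", "changetype: add"]
        for attr in sorted(adds[dn]):
            lines += [f"{attr}: {v}" for v in sorted(adds[dn][attr])]
        records.append("\n".join(lines))
    for dn in sorted(deletes):
        records.append(f"dn: {dn}\nchangetype: delete")
    for dn in sorted(modifies):
        lines = [f"dn: {dn}", "changetype: modify"]
        for attr in sorted(modifies[dn]):
            # "replace" with no values removes the attribute.
            lines.append(f"replace: {attr}")
            lines += [f"{attr}: {v}" for v in sorted(modifies[dn][attr])]
            lines.append("-")
        records.append("\n".join(lines))
    return "\n\n".join(records) + "\n"
```

Feeding yesterday's and today's dumps through parse_ldif, then diff_snapshots, then to_change_ldif gives a change file that could be reviewed before it goes anywhere near ldapmodify. A real version would also need to order adds parent-first and deletes child-first.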
If this occurs again, what all would be useful for me to pull aside before fixing it? I'll make a point of doing so.
Probably a copy of the id2entry database for starters. That's assuming the entry got stored incorrectly on disk. If it only got corrupted in the in-memory cache, we'd have to see the actual Entry structure in memory. That would mean using gdb to break into the flow of things while the bad entry is being accessed. (So you need a debug build of slapd with the full symbol table intact.)
It happened again this morning with another record. I am making a tarball of the entire database at this moment. One thing you brought up there made me think: what if I just restarted slapd? Would that not free up the structure in memory and, if that -is- the problem, resolve it and at least point to that being the problem?
Just tried that and it didn't solve anything. Unfortunately I can't leave it in this state so I'm going to repair the issue for now. But I do have a dump of the busted db.
As a complete aside, you might consider using something like the addpartial overlay in ITS#3593 http://www.openldap.org/its/index.cgi/Contrib?id=3593 instead of sort/diff. Just ldapadd everything and let the overlay figure out the changes. That's essentially what the syncrepl consumer does with the entries it receives from the provider.
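To make the "add everything and let the server sort it out" idea concrete, here is a rough sketch of add-or-update logic in Python. It is not the overlay's code and makes no claim about how ITS#3593 is implemented. The directory is just an in-memory dict standing in for the database, and add_or_update is a hypothetical name.

```python
def add_or_update(directory, dn, attrs):
    """Apply an 'add' of `attrs` at `dn` to an in-memory stand-in directory.

    `directory` maps dn -> {attr: set(values)}.
    Returns "added", "unchanged", or "modified".
    """
    incoming = {a: set(v) for a, v in attrs.items()}
    existing = directory.get(dn)
    if existing is None:
        # No such entry yet: a plain add.
        directory[dn] = incoming
        return "added"
    if existing == incoming:
        # Identical entry already present: nothing to do.
        return "unchanged"
    # Entry exists but differs: bring it in line with the incoming copy.
    for attr in set(existing) | set(incoming):
        if attr in incoming:
            existing[attr] = incoming[attr]
        else:
            del existing[attr]
    return "modified"
```

Note that this sketch only covers entries that show up in the incoming feed. Removing entries that have disappeared from the source would still need separate handling.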
I posted separately about this. I really do like that concept, a lot, and hope it gets pulled into openldap 'proper' at some point! That said, right now the sanity checks I'm doing are based off the diffs and since I'm already doing the work . . . =) Might as well use it.
Daniel