On Tue, 20 Jul 2010, masarati@aero.polimi.it wrote:
It turned out that the object cn=admin,dc=foo,dc=no had multiple occurrences of "objectClass: organizationalRole" (!), and this also prevented syncrepl from working. I suspect it was a result of "manual" editing of LDIF files followed by an import using slapadd. I get no warnings from slapadd when I import objects with multiple occurrences of the same objectClass.
Perhaps slapadd/slapd should be able to deal with such duplicate entries better, to make it more obvious what's wrong? I'm just saying :)
slapd(8) can handle those occurrences.
But does it handle it well enough, when it prevents syncrepl from working?
This is a side effect: the replica receives bogus data via the protocol, and rejects it.
slapadd(8) is intended to load LDIF files generated by slapcat(8), thus presumably consistent.
And the file was indeed an LDIF file generated by slapcat.
I mean: from slapcat of a sane database.
Since slapd allows it, slapcat will also spit it out - when slapcat, slapadd and slapd all "handle it" without giving any warnings back to anyone, it's not so easy to detect errors.
No, you miss one link: slapd did not handle it (I mean: through protocol). When slapd starts up and opens a database, it does not validate its content, of course. And when it returns an entry, it does not validate its contents. Only when a write is performed, the contents are validated (usually, only the bit that's being written, if it's a modify).
In general, it deals with the most obvious errors. I don't think asking slapadd to perform these checks is a good idea, as it would slow it down without real benefit: if an error is caught, you would need to restart, wasting all the actual write effort.
I don't quite agree - as I understand it, slapadd already does some sanity checking; how much overhead would a check for duplicate objectClass values imply?
Why don't you code and test it yourself? Checking for duplicates requires normalizing the data and comparing each value with every other. A wise implementation has quadratic cost (n*(n-1)/2 comparisons). You were offended by a duplicate objectClass issue this time. If next time it happens to a group with 10,000 members, you'll be whining that your groups are perfectly sane, so why does it take so long to load your LDIF?
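To make the n*(n-1)/2 figure concrete, here is a minimal Python sketch of a pairwise duplicate scan. It is not slapd's code: the function name is hypothetical, and lower-casing stands in for the schema-aware normalization a real server would apply.

```python
def find_duplicates_pairwise(values, normalize=str.lower):
    """Return (duplicates, comparisons) from a plain pairwise scan.

    Assumption: `normalize` is a stand-in for real matching-rule
    normalization; here it only lower-cases the value.
    """
    normalized = [normalize(v.strip()) for v in values]
    duplicates = []
    comparisons = 0
    for i in range(len(normalized)):
        # Compare each value only with the ones after it,
        # which gives n*(n-1)/2 comparisons in total.
        for j in range(i + 1, len(normalized)):
            comparisons += 1
            if normalized[i] == normalized[j]:
                duplicates.append(values[j])
    return duplicates, comparisons


if __name__ == "__main__":
    classes = ["top", "organizationalRole", "organizationalrole"]
    print(find_duplicates_pairwise(classes))
```

With three objectClass values the scan costs 3 comparisons; by the same formula a group with 10,000 member values would cost 10000*9999/2 = 49,995,000.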
And I don't see why you would need to restart: on a duplicate, either spit out a warning or, even better, spit out a warning and discard the duplicate.
Those are implementation details; in many cases, the database needs to be complete - no holes; so if slapadd rejects an entry, it may not be able to add its children.
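The "no holes" point can be illustrated with a small Python sketch that lists entries whose parent is missing from a set of DNs. The function name is hypothetical, and the DN handling is deliberately naive: it assumes no escaped commas, no spaces around separators, and case-insensitive comparison.

```python
def find_orphans(dns, suffix):
    """Return the DNs whose parent DN is not present in `dns`.

    Assumption: the parent is everything after the first comma, which
    is wrong for DNs containing escaped commas.
    """
    known = {dn.lower() for dn in dns}
    suffix = suffix.lower()
    orphans = []
    for dn in dns:
        lowered = dn.lower()
        if lowered == suffix:
            continue  # the database suffix has no parent to look for
        _, _, parent = lowered.partition(",")
        if parent not in known:
            orphans.append(dn)
    return orphans


if __name__ == "__main__":
    # ou=people was dropped, so its child is left without a parent.
    entries = ["dc=foo,dc=no", "uid=a,ou=people,dc=foo,dc=no"]
    print(find_orphans(entries, "dc=foo,dc=no"))
```

Dropping a single entry during a load can therefore make every entry beneath it unloadable as well.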
A sanity check tool for unreliable LDIF would probably be more appropriate. I guess at this point most users would pretend their LDIF is always reliable, and avoid running the sanity checker...
Really? Yes, I would love a sanity checker, and I would most likely _always_ run LDIF through a sanity checker before using slapadd to write to back-end.
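As a rough idea of what such a standalone checker could look like, here is a minimal Python sketch that reports duplicate objectClass values per entry. It is not an existing OpenLDAP tool: the function names are hypothetical, lower-casing stands in for real normalization, and base64-encoded values ("attr:: ...") are not decoded.

```python
def unfold(text):
    """Join LDIF continuation lines (those starting with one space)."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def duplicate_object_classes(text):
    """Return (dn, value) pairs for each repeated objectClass value."""
    findings = []
    dn, seen = None, set()
    for line in unfold(text):
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            dn, seen = None, set()  # a blank line ends the entry
            continue
        attr, _, value = line.partition(":")
        attr, value = attr.strip().lower(), value.strip()
        if attr == "dn":
            dn = value
        elif attr == "objectclass":
            key = value.lower()  # assumption: case-insensitive match only
            if key in seen:
                findings.append((dn, value))
            seen.add(key)
    return findings


if __name__ == "__main__":
    sample = (
        "dn: cn=admin,dc=foo,dc=no\n"
        "objectClass: organizationalRole\n"
        "objectClass: simpleSecurityObject\n"
        "objectClass: organizationalRole\n"
        "cn: admin\n"
    )
    for dn, value in duplicate_object_classes(sample):
        print(f"{dn}: duplicate objectClass {value}")
```

Run against slapcat output before slapadd, something along these lines would have flagged the cn=admin entry from this thread.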
But again - slapadd already does some sanity checking,
Usually, as much as it's strictly required to properly perform its own task - regenerate a presumably sane database.
and there's even a flag for "dry-run" mode (-u), which IMO says that it is supposed to be used as a sanity checking tool. I'm perfectly OK with letting _all_ sanity checks occur only when using -u.
Embedding the sanity checker in slapadd is an option, indeed. Not the default, IMHO.
I would love to dump all my ldap data to an LDIF and run it through a sanity checker, I suspect there's more "old noise" stuck in there.
Task separation is at the roots of clean programming - and system administration.
p.