Brett @Google wrote:
On Thu, Nov 5, 2009 at 4:20 AM, Howard Chu <hyc@symas.com> wrote:
> Out of interest, has the syncrepl UUID / CSN format changed much between
> 2.4.16 stable and 2.4.19 stable?

There have been no format changes. You should have been able to run 2.4.19 directly on the original database. No idea what issue you ran into.
The problem came back, but I think I found the root cause. We have a batch script which starts slapd; it has always been the same on both the producers and consumers. The "-w" slapadd option was being provided in this script, for both the consumer and the producer. This has been the case for quite some time (2.3.x and 2.4.16), but it only became an issue after the upgrade to 2.4.19. The sync problem would then appear or disappear randomly, depending on which of the provider or consumer was loaded first.
Is it patently wrong to provide the "-w" option on the consumer, or is this a bug?
Yes, it's generally wrong to use the -w option.
slapadd is intended for use with LDIF that was produced with slapcat. When you are slapcat'ing a database that has already been in use as a sync provider, then there should already be a valid contextCSN in the database and it will already be present in the LDIF. Once you have this LDIF from slapcat you should be able to load it on any other server without using -w.
The only correct usage of -w is when you have an LDIF produced by slapcat on a database that was not being used with replication before, and so is missing the contextCSN. But all the other operational attributes (entryUUID and entryCSN in particular) are present and valid.
If you have an LDIF file that contains no operational attributes, it is very likely that you ought to load it using ldapadd. If your LDIF file did not come from a known good slapd server, then most likely it needs all of the additional validation checks that are performed through ldapadd.
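To make the three cases above concrete, here is a rough Python sketch that inspects an LDIF and suggests which loader fits. It only illustrates the rules of thumb in this thread and is not part of OpenLDAP. The names parse_ldif and suggest_loader are made up for this example, and the parser is deliberately naive (no line folding, no base64 values).

```python
def parse_ldif(text):
    """Split LDIF text into entries (dicts of lowercase attribute name -> values).

    Simplified on purpose: folded lines and base64 ("::") values are not handled.
    """
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            # Skip blank lines, comments, and anything that is not "attr: value".
            if not line or line.startswith("#") or ": " not in line:
                continue
            name, value = line.split(": ", 1)
            entry.setdefault(name.lower(), []).append(value)
        if entry:
            entries.append(entry)
    return entries


def suggest_loader(ldif_text):
    """Return "ldapadd", "slapadd" or "slapadd -w" per the guidance in this thread."""
    entries = parse_ldif(ldif_text)
    # slapcat output carries entryUUID and entryCSN on every entry.
    has_op_attrs = bool(entries) and all(
        "entryuuid" in e and "entrycsn" in e for e in entries
    )
    # A database already used as a sync provider also carries contextCSN.
    has_context_csn = any("contextcsn" in e for e in entries)

    if not has_op_attrs:
        return "ldapadd"      # no operational attributes: go through full validation
    if has_context_csn:
        return "slapadd"      # contextCSN already in the LDIF: -w is not needed
    return "slapadd -w"       # slapcat output from a never-replicated database
```

A check like this only looks at which attributes are present. It cannot tell whether the LDIF really came from a known good slapd server, so that judgement remains with the administrator.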
Note that slapadd was changed in 2.4.19 to warn if you're loading a replica and entryCSN or entryUUID are missing.
At the very least it is probably a usability "gotcha", if not a bug.
If this is an issue, perhaps slapadd should refuse to do "-w" on a shadow context, or warn and not actually perform the option?
If the contextCSN was already present, and actually matches the max entryCSN present in the LDIF, then the write that -w performs should be effectively a no-op. If there were no entryCSNs in the LDIF then the value written by -w will just depend on the current time of day when the last entry got processed by slapadd.
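The two cases in that last paragraph can be modelled in a few lines of Python. This is only a sketch of the behaviour as described here, not slapadd's actual code. It assumes a single serverID and the 2.4 CSN layout (timestamp#count#sid#mod), where the fixed-width UTC timestamp prefix lets plain string comparison order the values. The function names are invented for this example.

```python
from datetime import datetime, timezone


def csn_from_time(moment):
    """Build a CSN string for a UTC datetime, with count, sid and mod set to zero."""
    return moment.strftime("%Y%m%d%H%M%S.%fZ") + "#000000#000#000000"


def context_csn_after_w(entry_csns, now):
    """Model the contextCSN that -w leaves behind.

    With entryCSNs present, the greatest one wins; with none, the value
    depends on the clock at the time the last entry was processed.
    """
    if entry_csns:
        return max(entry_csns)  # string comparison, see the assumption above
    return csn_from_time(now)


def w_is_noop(existing_context_csn, entry_csns, now):
    """True when -w would rewrite contextCSN with the value it already has."""
    return existing_context_csn == context_csn_after_w(entry_csns, now)


if __name__ == "__main__":
    csns = [
        "20091101120000.000000Z#000000#000#000000",
        "20091105042000.000000Z#000000#000#000000",
    ]
    now = datetime.now(timezone.utc)
    print(w_is_noop(csns[1], csns, now))   # contextCSN matches the max entryCSN
    print(context_csn_after_w([], now))    # no entryCSNs: clock-dependent value
```

The second case is the one that matters for the problem reported here: a contextCSN derived from the wall clock depends on when each server happened to be loaded, which would be consistent with the sync problem appearing or disappearing depending on load order.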