https://bugs.openldap.org/show_bug.cgi?id=9282
--- Comment #8 from Ondřej Kuzník <ondra(a)mistotebe.net> ---
On Thu, Jul 02, 2020 at 01:19:40PM +0000, openldap-its(a)openldap.org wrote:
--- Comment #7 from Howard Chu <hyc(a)openldap.org> ---
(In reply to Ondřej Kuzník from comment #6)
> Thanks for the reproducer script.
>
> This is due to
>
> https://git.openldap.org/openldap/openldap/-/blob/master/servers/slapd/syncrepl.c#L1638
> causing A to skip the present cull.
>
> Based on the git history, this was introduced to deal with ITS#5470 but that
> seems wrong, if the number of SIDs in the cookie differs from what we
> requested then either:
> - a SID disappeared from the set we received, which sounds like what
> ITS#5470 is about? (But slapd doesn't really allow this at the moment, as it
> will say consumer is newer than provider), so that shouldn't happen
A SID can't disappear. They tend to stay in the contextCSN forever. (This is
actually another problem, nodes that are converted from single-provider to
multi-provider generally still have a SID 0 CSN, which is always ancient
relative to the active SIDs. Routines that check for oldest CSN to still exist
in the DB lead to wasteful checks because of that. Right now all you can do is
use manage privs and delete the obsolete CSN.)
Yeah, and it would not be so wasteful if we could query the database for
the oldest/newest entry with a given SID in entryCSN. Removing a SID
from the set is always going to be a manual operation unless we can
coordinate with all provider and consumer nodes somehow.
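To make the kind of query meant here concrete, the following is a minimal sketch only. It assumes the usual CSN layout of timestamp#count#sid#mod with the SID as a hex field, and it scans an in-memory list of entryCSN values. The helper names csn_sid and csn_bounds_by_sid are hypothetical and are not slapd or backend APIs.

```python
def csn_sid(csn: str) -> int:
    """Return the SID of a CSN shaped like time#count#sid#mod (SID is hex)."""
    return int(csn.split("#")[2], 16)


def csn_bounds_by_sid(entry_csns):
    """Map each SID to the (oldest, newest) entryCSN seen for that SID.

    CSNs are fixed-width strings, so plain string comparison orders them
    by timestamp and then by change count.
    """
    bounds = {}
    for csn in entry_csns:
        sid = csn_sid(csn)
        oldest, newest = bounds.get(sid, (csn, csn))
        bounds[sid] = (min(oldest, csn), max(newest, csn))
    return bounds
```

With an index that could answer this per SID, the check for whether the oldest CSN still exists would become a lookup instead of a scan, which is the saving described above.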
> - a SID is added to the set by the provider, like here. This could be due to
> a delete (like here) and that delete has to be replicated - that is the
> point of running syncrepl_del_nonpresent
Yes, the problem that was being addressed is that if the local node knows about
more SIDs than the remote node, then the incoming present list from the remote
node can't be trusted. Doing a del_nonpresent could delete a lot of entries
that the remote node doesn't know about, but exist legitimately on the local
node.
The scenario I describe here is if we start a search with a cookie
containing only SIDs {1, 2} but finish present phase by receiving a
cookie with SIDs {1, 2, 3}. Accepting that cookie implies we have to
process the (implied) deletes too or we have desynced.
If, in the meantime, we added entries with a SID of 4, those are not
part of the original cookie and should not be deleted, that's for sure.
I think we do the right thing already or are close to doing so.
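The SID bookkeeping in that scenario can be sketched as below. This is only an illustration of the reasoning at the SID level, not what syncrepl.c actually does: a real check would also compare CSN values, and the function names classify_sids and entry_may_be_culled are hypothetical.

```python
def classify_sids(requested, received, local):
    """Compare the SID sets of the cookie we sent, the cookie we got back,
    and the SIDs known locally."""
    requested, received, local = set(requested), set(received), set(local)
    return {
        # SIDs the provider added during the refresh (SID 3 in the example)
        "added_by_provider": received - requested,
        # SIDs only the local node knows about (SID 4 in the example)
        "local_only": local - received,
        # SIDs that vanished from the cookie; not expected to happen
        "missing_from_provider": requested - received,
    }


def entry_may_be_culled(entry_sid, received, present):
    """An entry missing from the present list is a delete candidate only
    when its SID is covered by the cookie we accepted."""
    return (not present) and entry_sid in set(received)
```

For example, with requested {1, 2}, received {1, 2, 3} and local {1, 2, 4}, an entry with SID 3 that is absent from the present list is a cull candidate, while an entry with SID 4 is left alone.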
I think a proper fix would require a change in the syncrepl protocol
sequencing. E.g., two nodes should refresh from each other with all of their
new Adds/Modifies first, and once those changes have been settled, then they
can perform a present cross-check. This would also require saving some
intermediate cookie state in case the full sequence gets interrupted.
Or, put in another way, there needs to be a separately tracked
contextDeleteCSN.
That's ITS#8125 work, I should get back to that eventually.