On Tue, Jun 14, 2022 at 01:40:56PM +0200, Ondřej Kuzník wrote:
It's becoming untenable that a plain refresh cannot be represented in accesslog in a way that's capable of serving a deltasync session. Whatever happens, we have lost a fair amount of the information needed to run a proper deltasync, yet if we don't want to abandon this functionality, we have to try and fill some of it in.
We had a discussion on how to address this and some proposals have been floated so far; I'll broadly keep them to one per thread. This is one of them, call it proposal 1 if you will.
Changes are contained to the provider: during a refresh (both present and delete phase), we send entries and deletes ordered by CSN, which leaves the consumer's accesslog in better shape for deltasync consumption. Currently, if we run a delete phase, we send deletes first, then updates. Ordering by CSN requires that we change that[0] and mix output from the sessionlog and the internal search. For an accesslog-based sessionlog, that calls for two concurrent searches being run while processing a single operation, which is currently impossible. We also need to provide an efficient server-side sorting implementation that does this without reading all entries first and then sorting them.
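To illustrate the interleaving only, here is a minimal sketch in Python, not slapd code. It assumes both sources (deletes from the sessionlog, updates from the internal search) can each already be produced in CSN order, which is itself part of what would need solving, and that CSNs are fixed-width strings that compare correctly as plain strings. The names Change and merge_by_csn are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Change(NamedTuple):
    csn: str   # fixed-width CSN string; compared lexicographically (assumption)
    kind: str  # "delete" (from sessionlog) or "update" (from internal search)
    dn: str


def merge_by_csn(deletes: Iterable[Change],
                 updates: Iterable[Change]) -> Iterator[Change]:
    """Lazily interleave two streams that are each already CSN-ordered.

    heapq.merge only holds one pending item per input stream, so neither
    stream is read in full before the first change is emitted.
    """
    return heapq.merge(deletes, updates, key=lambda c: c.csn)


# Made-up sample data for illustration.
deletes = [
    Change("20220614114001.000000Z#000000#001#000000", "delete", "cn=a"),
    Change("20220614114003.000000Z#000000#001#000000", "delete", "cn=c"),
]
updates = [
    Change("20220614114002.000000Z#000000#001#000000", "update", "cn=b"),
    Change("20220614114004.000000Z#000000#001#000000", "update", "cn=d"),
]

for change in merge_by_csn(iter(deletes), iter(updates)):
    print(change.csn, change.kind, change.dn)
```

The point of the sketch is that a two-way merge needs both streams open at once, which is the "two concurrent searches" requirement above; it says nothing about how to get the internal search to return entries in CSN order in the first place.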
This way we never explicitly send changes out of order. However, there remains an implicit out-of-order change at the end of a present phase: the consumer still has to log those implied deletes somehow, and choosing any single CSN leads to scenarios where divergence is still inevitable. On the other hand, any part of this change can be introduced at the same time as most of the other proposals.
[0]. RFC 4533 is extremely vague about when updates are sent at all during refresh, so it seems we are within our rights to do this.