On 29/01/15 04:12, Howard Chu wrote:
One thing I just noticed, while testing replication with 3 servers on my laptop - during a refresh, the provider gets blocked waiting to write to the consumers after writing about 4000 entries. I.e., the consumers aren't processing fast enough to keep up with the search running on the provider.
(That's actually not too surprising since reads are usually faster than writes anyway.)
The consumer code has lots of problems as it is; just adding this note to the pile.
I'm considering adding an option to the consumer to write its entries with dbnosync during the refresh phase. The rationale being, there's nothing to lose anyway if the refresh is interrupted. I.e., the consumer can't update its contextCSN until the very end of the refresh, so any partial refresh that gets interrupted is wasted effort - the consumer will always have to start over from the beginning on its next refresh attempt. As such, there's no point in safely/synchronously writing any of the received entries - they're useless until the final contextCSN update.
The implementation approach would be to define a new control, e.g. "fast write", for the consumer to pass to the underlying backend on any write op. We would also have to add, e.g., an MDB_TXN_NOSYNC flag to mdb_txn_begin() (BDB already has an equivalent flag).
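A minimal sketch of what the consumer's write path might look like, assuming the proposed MDB_TXN_NOSYNC flag were added to mdb_txn_begin(); the flag value and the helper function here are illustrative only, not existing LMDB or slapd code:

/* Sketch of the refresh-phase write path, assuming the proposed
 * MDB_TXN_NOSYNC flag gets added to mdb_txn_begin(). The flag does
 * not exist in lmdb.h yet; the define below is a placeholder that
 * reuses the existing env-level MDB_NOSYNC value for illustration. */
#include <lmdb.h>

#ifndef MDB_TXN_NOSYNC
#define MDB_TXN_NOSYNC MDB_NOSYNC /* hypothetical txn-level flag */
#endif

static int refresh_write_entry(MDB_env *env, MDB_dbi dbi,
                               MDB_val *key, MDB_val *data,
                               int in_refresh)
{
    MDB_txn *txn;
    /* During refresh, skip the fsync on commit: a partial refresh is
     * wasted effort anyway, so per-entry durability buys nothing. */
    unsigned int flags = in_refresh ? MDB_TXN_NOSYNC : 0;
    int rc = mdb_txn_begin(env, NULL, flags, &txn);
    if (rc)
        return rc;
    rc = mdb_put(txn, dbi, key, data, 0);
    if (rc) {
        mdb_txn_abort(txn);
        return rc;
    }
    return mdb_txn_commit(txn);
}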
This would only be used for writes that are part of a refresh phase. In persist mode the provider's and consumers' write speeds should be more closely matched, so it wouldn't be necessary or useful.
Comments?
The proposal sounds sane.
Speaking of which, we had a discussion about some other features that would be nice to have: when a consumer reconnects to a provider, it has no idea how many entries it will receive. It would be valuable to pass an extra piece of information in the exchanged cookie, namely the number of updated entries. That could provide a hint for users or admins who would like to know how long the update will take on a consumer (assuming we log such information). Also, batching the updates in the backend, i.e. grouping the updates before syncing them, could be interesting to have, again associated with some logging, so the admin/user can follow the update's progression; see the sketch after the example below.
Something like:
syncrepl : 1240 entries to update
syncrepl : 200/1240 entries updated
syncrepl : 400/1240 entries updated
...
syncrepl : server up to date.
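A rough sketch of how that batching could look on the consumer side, assuming LMDB as the backend; BATCH_SIZE, refresh_apply() and the flat key/value arrays are illustrative, not an existing slapd interface, and the total count is taken from the extended cookie suggested above:

/* Rough sketch of the batching idea: group consumer writes into one
 * LMDB transaction per batch, so there is one sync per BATCH_SIZE
 * entries instead of one per entry, and log progress against the
 * total carried in the extended cookie. All names are illustrative. */
#include <stdio.h>
#include <lmdb.h>

#define BATCH_SIZE 200

static int refresh_apply(MDB_env *env, MDB_dbi dbi,
                         MDB_val *keys, MDB_val *vals,
                         unsigned int total /* from the cookie */)
{
    unsigned int done = 0;
    printf("syncrepl : %u entries to update\n", total);
    while (done < total) {
        MDB_txn *txn;
        unsigned int i, n;
        int rc = mdb_txn_begin(env, NULL, 0, &txn);
        if (rc)
            return rc;
        n = total - done < BATCH_SIZE ? total - done : BATCH_SIZE;
        for (i = 0; i < n; i++) {
            rc = mdb_put(txn, dbi, &keys[done + i], &vals[done + i], 0);
            if (rc) {
                mdb_txn_abort(txn);
                return rc;
            }
        }
        rc = mdb_txn_commit(txn); /* one sync per batch */
        if (rc)
            return rc;
        done += n;
        printf("syncrepl : %u/%u entries updated\n", done, total);
    }
    printf("syncrepl : server up to date.\n");
    return 0;
}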