On 29/01/15 04:12, Howard Chu wrote:
One thing I just noticed, while testing replication with 3 servers on my laptop - during a refresh, the provider gets blocked waiting to write to the consumers after writing about 4000 entries. I.e., the consumers aren't processing fast enough to keep up with the search running on the provider.
(That's actually not too surprising since reads are usually faster than writes anyway.)
The consumer code has lots of problems as it is; just adding this note to the pile.
I'm considering adding an option to the consumer to write its entries with dbnosync during the refresh phase. The rationale being, there's nothing to lose anyway if the refresh is interrupted. I.e., the consumer can't update its contextCSN until the very end of the refresh, so any partial refresh that gets interrupted is wasted effort - the consumer will always have to start over from the beginning on its next refresh attempt. As such, there's no point in safely/synchronously writing any of the received entries - they're useless until the final contextCSN update.
The implementation approach would be to define a new control, e.g. "fast write", for the consumer to pass to the underlying backend on any write op. We would also have to add, e.g., an MDB_TXN_NOSYNC flag to mdb_txn_begin() (BDB already has an equivalent flag).
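A minimal sketch of what the consumer's write path might look like, assuming the proposed MDB_TXN_NOSYNC flag were added to mdb_txn_begin(); the flag value and the helper function here are illustrative only, not existing LMDB or slapd code:

/* Sketch of the refresh-phase write path, assuming the proposed
 * MDB_TXN_NOSYNC flag gets added to mdb_txn_begin(). The flag does
 * not exist in lmdb.h yet; the define below is a placeholder that
 * reuses the existing env-level MDB_NOSYNC value for illustration. */
#include <lmdb.h>

#ifndef MDB_TXN_NOSYNC
#define MDB_TXN_NOSYNC MDB_NOSYNC /* hypothetical txn-level flag */
#endif

static int refresh_write_entry(MDB_env *env, MDB_dbi dbi,
                               MDB_val *key, MDB_val *data,
                               int in_refresh)
{
    MDB_txn *txn;
    /* During refresh, skip the fsync on commit: a partial refresh is
     * wasted effort anyway, so per-entry durability buys nothing. */
    unsigned int flags = in_refresh ? MDB_TXN_NOSYNC : 0;
    int rc = mdb_txn_begin(env, NULL, flags, &txn);
    if (rc)
        return rc;
    rc = mdb_put(txn, dbi, key, data, 0);
    if (rc) {
        mdb_txn_abort(txn);
        return rc;
    }
    return mdb_txn_commit(txn);
}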
This would only be used for writes that are part of a refresh phase. In persist mode the provider's and consumers' write speeds should be more closely matched, so it wouldn't be necessary or useful.
Comments?
The proposal sounds sane.
Speaking of which, we had a discussion about some other features that would be nice to have: when a consumer reconnects to a provider, it has no idea how many entries it will receive. It would be valuable to pass an extra piece of information in the exchanged cookie, namely the number of updated entries. That could provide a hint for users or admins who would like to know how long the update will take on a consumer (assuming we log such information). Also, batching the updates in the backend, i.e. grouping the updates before syncing them, could be interesting to have, again associated with some logging, so the admin/user can follow the update's progression; see the sketch after the example below.
Something like:
syncrepl : 1240 entries to update
syncrepl : 200/1240 entries updated
syncrepl : 400/1240 entries updated
...
syncrepl : server up to date.
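A rough sketch of how that batching could look on the consumer side, assuming LMDB as the backend; BATCH_SIZE, refresh_apply() and the flat key/value arrays are illustrative, not an existing slapd interface, and the total count is taken from the extended cookie suggested above:

/* Rough sketch of the batching idea: group consumer writes into one
 * LMDB transaction per batch, so there is one sync per BATCH_SIZE
 * entries instead of one per entry, and log progress against the
 * total carried in the extended cookie. All names are illustrative. */
#include <stdio.h>
#include <lmdb.h>

#define BATCH_SIZE 200

static int refresh_apply(MDB_env *env, MDB_dbi dbi,
                         MDB_val *keys, MDB_val *vals,
                         unsigned int total /* from the cookie */)
{
    unsigned int done = 0;
    printf("syncrepl : %u entries to update\n", total);
    while (done < total) {
        MDB_txn *txn;
        unsigned int i, n;
        int rc = mdb_txn_begin(env, NULL, 0, &txn);
        if (rc)
            return rc;
        n = total - done < BATCH_SIZE ? total - done : BATCH_SIZE;
        for (i = 0; i < n; i++) {
            rc = mdb_put(txn, dbi, &keys[done + i], &vals[done + i], 0);
            if (rc) {
                mdb_txn_abort(txn);
                return rc;
            }
        }
        rc = mdb_txn_commit(txn); /* one sync per batch */
        if (rc)
            return rc;
        done += n;
        printf("syncrepl : %u/%u entries updated\n", done, total);
    }
    printf("syncrepl : server up to date.\n");
    return 0;
}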