Re: syncrepl consumer is slow

12 May 2015


      Le 12/05/15 14:34, Howard Chu a écrit :
...
Emmanuel Lécharny wrote:
...
Le 11/05/15 22:17, Howard Chu a écrit :
...
There are two main problems:

the AVL tree used for presentlist is still extremely inefficient

in both CPU and memory use.
  2) the consumer does twice as much work for a single modification as
the provider. I.e., the consumer does a write op to the backend for
the modification, and then a second write op to update its contextCSN.
...
Updating the contextCSN is an extra operation on the consumer, but as
you have to update potentially tens of indexes when updating an entry
(on both teh consumer and the producer), it's not really twoce more
work. It's an additianal operation, but that would not double the time
it costs on the producer.
You're forgetting one very important thing - each operation is a
single transaction in the backend, and transactions are synchronous by
default. The main cost is not the indexing, it's the txn fsync, and
yes, it is twice the cost when you're doing two txns instead of just one.
Good point.
...
...
The question would be : how do we update the contextCSN only
periodically, to mitigate this extra cost, and it seems you proposed to
batch the updates for this reason. By using btaches of 500 updates, this
extra cost will be almost unnoticable, and one would expect the work on
the consumer to be the same as on the producer side, right ?
Yes.
...
...
The provider only does the original modification, and caches the
contextCSN update.
If we fix both of these issues, consumer speed should be much faster.
Nothing else is worth investigating until these two areas are reworked.
...
Agreed in most of the case. Although for use cases of an important
number of updates have occured while a consumer is off line, another
strategy might work. That this other strategy is to stop the consumer,
slapcat the producer, slapadd the result and restart the server, all
with the command line, instead of having it implemented in the server
code, was what I was suggesting, but this is another story for a corner
case that is not frequent. Plus we don't know at which point this would
be the correct strategy (ie, for how many updates should we consider it
as a better startegy than the current implementation ?).
If the consumer has been offline for a long time, then this discussion
is moot. No clients will be looking at it, so the risk of serving
out-of-date information to clients is zero. In that case, it doesn't
matter what strategy you use, they'll all work.
Another good point. It would require a use that send a hell lots of
updates while no client is having activity, and a disconnected consumer
- all three conditions at the same time - to face my scenario. Quite
rare. I had in mind this user who was updating his database with
millions of updates once in a while (say, once a year), during night,
and who find than teh consumer is not up and running in the morning.
Not sure then it worth the effort to find a way to mitigate such corner
case.
...
...
...
For (1) I've been considering a stripped down memory-only version of
LMDB. There are plenty of existing memory-only Btree implementations
out there already though, if anyone has a favorite it would probably
save us some time to use an existing library. The Linux kernel has one
(lib/btree.c) but it's under GPL so we can't use it directly.
Q : do you need to keep the presentList ina BTree at all ?
Good question. We process it by doing a single search over the target
range, and removing presentlist entries for each entry returned by the
search. Since the search order is random, we want fast search access
to the presentlist.
We could alternatively do a dynamic array and walk the presentlist in
order, doing (entryUUID=x) searches on each element. The overhead of
doing X individual searches is worse than doing one global search though.
If the goal is to find all the entries that are not present in the DB,
wouldn't it be faster to simply quick sort the entryUUID we have
received? Both algorithms (AVL insertions and quickSort) are in O(n x
Log(n)) - if you except the possibility that the quicksort degenerates
in O(n2), of course - but Quicksort is faster than AVL when it comes to
order a set of values.

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

Re: syncrepl consumer is slow