Emmanuel Lécharny wrote:
Le 11/05/15 22:17, Howard Chu a écrit :
> There are two main problems:
> 1) the AVL tree used for presentlist is still extremely inefficient
> in both CPU and memory use.
> 2) the consumer does twice as much work for a single modification as
> the provider. I.e., the consumer does a write op to the backend for
> the modification, and then a second write op to update its contextCSN.
Updating the contextCSN is an extra operation on the consumer, but
you have to update potentially tens of indexes when updating an entry
(on both teh consumer and the producer), it's not really twoce more
work. It's an additianal operation, but that would not double the time
it costs on the producer.
You're forgetting one very important thing - each operation is a single
transaction in the backend, and transactions are synchronous by default. The
main cost is not the indexing, it's the txn fsync, and yes, it is twice the
cost when you're doing two txns instead of just one.
The question would be : how do we update the contextCSN only
periodically, to mitigate this extra cost, and it seems you proposed to
batch the updates for this reason. By using btaches of 500 updates, this
extra cost will be almost unnoticable, and one would expect the work on
the consumer to be the same as on the producer side, right ?
> The provider only does the original modification, and caches the
> contextCSN update.
> If we fix both of these issues, consumer speed should be much faster.
> Nothing else is worth investigating until these two areas are reworked.
Agreed in most of the case. Although for use cases of an important
number of updates have occured while a consumer is off line, another
strategy might work. That this other strategy is to stop the consumer,
slapcat the producer, slapadd the result and restart the server, all
with the command line, instead of having it implemented in the server
code, was what I was suggesting, but this is another story for a corner
case that is not frequent. Plus we don't know at which point this would
be the correct strategy (ie, for how many updates should we consider it
as a better startegy than the current implementation ?).
If the consumer has been offline for a long time, then this discussion is
moot. No clients will be looking at it, so the risk of serving out-of-date
information to clients is zero. In that case, it doesn't matter what strategy
you use, they'll all work.
> For (1) I've been considering a stripped down memory-only
> LMDB. There are plenty of existing memory-only Btree implementations
> out there already though, if anyone has a favorite it would probably
> save us some time to use an existing library. The Linux kernel has one
> (lib/btree.c) but it's under GPL so we can't use it directly.
Q : do you need to keep the presentList ina BTree at all ?
Good question. We process it by doing a single search over the target range,
and removing presentlist entries for each entry returned by the search. Since
the search order is random, we want fast search access to the presentlist.
We could alternatively do a dynamic array and walk the presentlist in order,
doing (entryUUID=x) searches on each element. The overhead of doing X
individual searches is worse than doing one global search though.
>> Another point : as soon as the server is restarted, it can
>> incoming requests, which will send back outdated response, until the
>> refresh is completed (and i'm not talking about updates that could also
>> be applied on an outdated base, with the consequences if there are some
>> missing parents). In many cases, that would be a real problem, typically
>> if the LDAP servers are considered as part of a shared pool of server,
>> with a load balance mecahnism to spread the load. Wouldn't be more
>> realistic to simply consider the server as not available until the
>> refresh phase is completed ?
> This was ITS#7616. We tried it and it caused a lot of problems. It has
> been reverted.
The two options were to either send a referral (not ideal, as we have no
control whatesoever on the client API) and LDAP_BUSY. A third option
would be possible : chaining the request to the server from which the
replication updates are coming from. Doing so will guarantee that the
client will gets a updated version of the data, as the producer is up to
date. There is still an issue though if both servers are replicating
each other (pretty much the pb with referrals). OTOH, if the other
server is also in refresh mode, it should be possible to return a
LDAP_BUSY if it is capable of detecting that the requests come from
another server, not for a client. Maybe it's far fetched...
In practice, two MMR servers pointed at each other would never make progress.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/