Howard Chu wrote:
Emmanuel Lécharny wrote:
> On 03/02/15 09:41, Howard Chu wrote:
>> Emmanuel Lécharny wrote:
>>> On 03/02/15 05:11, Howard Chu wrote:
>>>> Another option here is simply to perform batching. Now that we have
>>>> the TXN api exposed in the backend interface, we could just batch up
>>>> e.g. 500 entries per txn, much like slapadd -q already does.
>>>> Ultimately we ought to be able to get syncrepl refresh to occur at
>>>> nearly the same speed as slapadd -q.
>>> Batching is ok, except that you never know how many entries you'll
>>> have, so you will have to actually write the data after a period of
>>> time, even if you don't have the 500 entries.
>> This isn't a problem - we know exactly when refresh completes, so we
>> can finish the batch regardless of how many entries are left over.
> True for Refresh. I was thinking more specifically of updates when we
> are connected.
None of this is for the Persist phase; I have only been talking about refresh.
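
For illustration only, the batching idea for the refresh phase can be sketched as below. This is not slapd code: `CountingStore`, `apply_refresh`, and the `put`/`commit` methods are hypothetical names invented for this sketch, and the end of the input iterable stands in for the point where refresh completes. The only things taken from the thread are the batch size of 500 and the rule that the last, partial batch is committed when refresh ends.

```python
class CountingStore:
    """Hypothetical stand-in for a transactional backend; it only counts commits."""

    def __init__(self):
        self.entries = []     # entries made durable by a commit
        self.commits = 0      # number of transactions committed
        self._pending = []    # entries written in the open transaction

    def put(self, entry):
        self._pending.append(entry)

    def commit(self):
        self.entries.extend(self._pending)
        self._pending.clear()
        self.commits += 1


def apply_refresh(entries, store, batch_size=500):
    """Write entries in transactions of at most batch_size entries."""
    in_txn = 0
    for entry in entries:
        store.put(entry)
        in_txn += 1
        if in_txn == batch_size:
            store.commit()
            in_txn = 0
    # The iterable ending models the end of the refresh phase:
    # commit whatever is left over, even if the batch is not full.
    if in_txn:
        store.commit()
    return store.commits


store = CountingStore()
commits = apply_refresh((f"uid={i}" for i in range(1234)), store)
print(commits, len(store.entries))
```

With 1234 entries this gives two full batches and one partial one, so no timer is needed to flush the tail during refresh.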
>> Testing this out with the experimental ITS#8040 patch - with lazy
>> commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
>> to pull them across. With batching 500 entries/txn + lazy commit it
>> takes ~7 minutes, a decent improvement. It's still 2x slower than
>> slapadd -q though, which loads the data in 3-1/2 minutes.
> Not bad at all. What makes it 2x slower, btw?
Still looking into it. slapadd -q uses 2 threads, one to parse the LDIF
and one to write to the DB. syncrepl consumer only uses 1 thread.
Probably if we split reading from the network apart from writing to the
DB, that would make the difference.
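
As a rough sketch of that split (an assumption about the shape of a possible fix, not the actual consumer code), one thread can pull entries from the network and hand them through a bounded queue to a second thread that writes them in batches. All names here (`reader`, `writer`, `run_pipeline`, the queue depth of 1000) are hypothetical, and the "write" is just an append to a list.

```python
import queue
import threading

_DONE = object()  # sentinel: the reader has seen the end of refresh


def reader(source, q):
    """Stand-in for the network side: push each received entry onto the queue."""
    for entry in source:
        q.put(entry)          # blocks when the queue is full (back-pressure)
    q.put(_DONE)


def writer(q, written, batch_size=500):
    """Stand-in for the DB side: collect entries and 'commit' them in batches."""
    batch = []
    while True:
        item = q.get()
        if item is _DONE:
            break
        batch.append(item)
        if len(batch) == batch_size:
            written.append(list(batch))   # one list per simulated transaction
            batch.clear()
    if batch:
        written.append(list(batch))       # final partial batch


def run_pipeline(source, batch_size=500, depth=1000):
    q = queue.Queue(maxsize=depth)
    written = []
    t_read = threading.Thread(target=reader, args=(source, q))
    t_write = threading.Thread(target=writer, args=(q, written, batch_size))
    t_read.start()
    t_write.start()
    t_read.join()
    t_write.join()
    return written


batches = run_pipeline(range(1200))
print([len(b) for b in batches])
```

The bounded queue keeps the reader from running arbitrarily far ahead of the writer; whether this actually closes the gap to slapadd -q is exactly what remains to be measured.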
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/