Hello,
I've a problem with the speed of replication.
I've set up OpenLDAP 2.4.33 with a master and one consumer. At the moment the full replication takes about 32 hours. No LDAP operations are made on the master or consumer during this time. (I know it depends on the hardware too, but the two servers are fast.)
How long should it take to replicate a DB of about 6 GByte (id2entry.bdb + dn2id.bdb) with 1.6M DNs and about 66M attributes? Replication is configured with refreshAndPersist, without delta-syncrepl. Both servers are on the same IP segment, connected via a gigabit Ethernet switch.
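As a rough sanity check, the figures above can be turned into average rates. This is only a back-of-envelope sketch: it assumes decimal gigabytes and treats the on-disk size as if it were the amount of data crossing the wire, which it is not exactly.

```python
# Back-of-envelope rates implied by the numbers quoted in this message.
DB_BYTES = 6 * 10**9        # ~6 GByte on disk (decimal GB assumed)
ENTRIES = 1_600_000         # 1.6M DNs
ATTRIBUTES = 66_000_000     # ~66M attributes
DURATION_S = 32 * 3600      # 32 hours for the full replication

entries_per_s = ENTRIES / DURATION_S
attrs_per_entry = ATTRIBUTES / ENTRIES
# Assumption: on-disk size stands in for the bytes actually transferred.
avg_mbit_per_s = DB_BYTES * 8 / DURATION_S / 1e6

print(f"{entries_per_s:.1f} entries/s")
print(f"{attrs_per_entry:.1f} attributes per entry")
print(f"{avg_mbit_per_s:.2f} Mbit/s on average")
```

That works out to roughly 14 entries per second and well under 1 Mbit/s on average, so neither the gigabit link nor the raw data volume looks like the limiting factor.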
I experimented in a test environment with different parameters:
- shm_key
- dbnosync
- switched off all indexes on the consumer except entryUUID and entryCSN
- different BDB cachesize values
- noatime, relatime
- ext3/xfs
I looked at the disk via iostat (nothing notable), saw no I/O waits in top, and looked at the network, where at most 5 Mbit/s is used. I traced slapd with strace and can see that slapd is reading from the network and writing to id2entry.bdb.
Before each test, I deleted the complete LDAP database (except DB_CONFIG) and removed the shared memory segment with ipcrm -m.
Are there limitations, similar to BDB_IDL_LOGN, that could cause slow replication? How can I speed up this replication? I'm of the opinion that it was significantly faster with a smaller database.
Thanks and kind regards,
Meike
Configuration:
--------------------
This is only a test configuration; some values differ, and some are commented out because I was experimenting with them.
# Master (Provider)
# ==========================================================
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/yast.schema
include /etc/openldap/schema/rfc2307bis.schema
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
modulepath /usr/lib/ldap
moduleload back_bdb
moduleload syncprov
sizelimit -1
timelimit 300
tool-threads 8
threads 8
serverID 001
########################################
database bdb
suffix "ou=root"
rootdn "cn=admin,ou=root"
#loglevel stats sync
loglevel 0
rootpw <password>
directory /DATA/ldap
#cachesize 500000
#dncachesize 500000
#idlcachesize 150000
cachefree 500
dirtyread
dbnosync
shm_key 7
checkpoint 4096 15
index objectClass,entryUUID,entryCSN eq
index cn eq,sub
index ownattributes ....
overlay syncprov
syncprov-checkpoint 100 5
# Consumer
# ==========================================================
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/yast.schema
include /etc/openldap/schema/rfc2307bis.schema
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
modulepath /usr/lib/ldap
moduleload back_bdb
moduleload syncprov
sizelimit -1
timelimit 300
serverID 002
#loglevel stats sync
loglevel 0
########################################
database bdb
suffix "ou=root"
rootdn "cn=admin,ou=root"
checkpoint 4096 15
rootpw <password>
directory /DATA/ldap
dbnosync
shm_key 7
checkpoint 4096 15
#cachesize 100000
#dncachesize 100000
#idlcachesize 150000
#cachefree 500
#dirtyread
syncrepl rid=020
  provider=ldap://192.168.1.10
  type=refreshAndPersist
  retry="5 5 300 +"
  searchbase="ou=root"
  attrs="*,+"
  bindmethod=simple
  binddn="cn=admin,ou=root"
  credentials=<password>
index entryUUID,entryCSN eq
#index cn eq,sub
mirrormode FALSE
On Wed, 24 Apr 2013, Meike Stone wrote:
> I've set up OpenLDAP 2.4.33 with a master and one consumer. At the moment the full replication takes about 32 hours.
syncrepl really isn't intended for initial "full" loads, although it will work eventually (as you've seen). The preferred method for standing up an offline server is slapadd -q. syncrepl can then handle deltas since the LDIF was generated; this should complete fairly rapidly.
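The seeding workflow described above could look roughly like the following sketch. It only builds the two command lines and does not run anything. The config path and the LDIF path are assumptions for illustration, not values from this thread; slapadd has to run while the consumer's slapd is stopped.

```python
import shlex

def seed_commands(suffix, ldif_path, conf="/etc/openldap/slapd.conf"):
    """Return (export_cmd, import_cmd) as argv lists; nothing is executed.

    conf and ldif_path are hypothetical paths chosen for this example.
    """
    # On the provider: dump the database for the given suffix to LDIF.
    export_cmd = ["slapcat", "-f", conf, "-b", suffix, "-l", ldif_path]
    # On the (stopped) consumer: bulk-load the LDIF in quick mode.
    import_cmd = ["slapadd", "-q", "-f", conf, "-b", suffix, "-l", ldif_path]
    return export_cmd, import_cmd

export_cmd, import_cmd = seed_commands("ou=root", "/tmp/root.ldif")
print(shlex.join(export_cmd))
print(shlex.join(import_cmd))
```

After the load, starting slapd on the consumer lets syncrepl pick up only the changes made since the LDIF was generated.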
> I'm of the opinion that it was significantly faster with a smaller database.
Rule of thumb, the less data to process, the less time it takes...
> syncrepl really isn't intended for initial "full" loads, although it will work eventually (as you've seen). The preferred method for standing up an offline server is slapadd -q. syncrepl can then handle deltas since the LDIF was generated; this should complete fairly rapidly.
OK, sounds logical, but if I use slapcat on a running slapd with a big DB, is it guaranteed that the resulting LDIF is consistent and will work with slapadd?
Second, I used a 3 h old LDIF from slapcat on the consumer, and the replication needed about 6 h for the resync (same contextCSN). On the provider (master) I've set loglevel 256, so I used the script ldap-stats.pl (http://prefetch.net/code/ldap-stats.pl.html) to determine how many changes are made per hour (about 5000/h). The server hardware itself is idling the whole time (CPU/disk/network). On the consumer the CPU is running at nearly 100%.
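To put these numbers in proportion, here is a small estimate of how many changes the consumer actually had to catch up on. It assumes the change rate stays constant at the measured value and, pessimistically, that every change touches a different entry; the total entry count is the 1.6M DNs mentioned at the start of the thread.

```python
CHANGES_PER_HOUR = 5000     # measured on the provider with ldap-stats.pl
LDIF_AGE_H = 3              # age of the slapcat LDIF when the resync started
RESYNC_H = 6                # observed duration of the resync
TOTAL_ENTRIES = 1_600_000   # 1.6M DNs in the directory

backlog = CHANGES_PER_HOUR * LDIF_AGE_H          # changes missing from the LDIF
arrived_meanwhile = CHANGES_PER_HOUR * RESYNC_H  # changes made during the resync
changed = backlog + arrived_meanwhile

print(f"changes to apply: about {changed}")
print(f"share of directory: {changed / TOTAL_ENTRIES:.1%}")
print(f"implied apply rate: {changed / (RESYNC_H * 3600):.2f} changes/s")
```

Under these assumptions fewer than 3% of the entries changed, so the six hours are probably not explained by the volume of modifications alone.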
Is it possible to determine which state (present phase, delete phase, ...) the consumer is in (e.g. via the monitoring DB)?
Is it possible to simulate the present phase with ldapsearch, to see whether the provider takes so long and, if so, which part (updated entries or unchanged entries) takes so long?
>> I'm of the opinion that it was significantly faster with a smaller database.
> Rule of thumb, the less data to process, the less time it takes...
But presumably non-linear ...
Thanks
Meike
Meike Stone wrote (26.04.2013, 14:34):
> Is it possible to simulate the present phase with ldapsearch, to see whether the provider takes so long and, if so, which part (updated entries or unchanged entries) takes so long?
Look at man ldapsearch for "-E" and sync=rp[/<cookie>][/<slimit>] "(LDAP Sync refreshAndPersist)". The cookie is something like "rid=${RID},csn=${CSN}".
But I'm not sure it does what you want.
Marc
2013/4/26 Marc Patermann <hans.moser@ofd-z.niedersachsen.de>:
> Meike Stone wrote (26.04.2013, 14:34):
>> Is it possible to simulate the present phase with ldapsearch, to see whether the provider takes so long and, if so, which part (updated entries or unchanged entries) takes so long?
> Look at man ldapsearch for "-E" and sync=rp[/<cookie>][/<slimit>] "(LDAP Sync refreshAndPersist)". The cookie is something like "rid=${RID},csn=${CSN}".
> But I'm not sure it does what you want.
For that point yes, thanks.
I tried it:
- got the contextCSN on the server via
  ldapsearch -x -h localhost -w password -D cn=admin,ou=root -bou=root -s base contextCSN -LLL
- waited for about 20 min
- started a refreshOnly search
  ldapsearch -x -h localhost -wpassword -D"cn=admin,ou=root" -b"ou=root" -s sub -E sync=ro/rid=103,csn=20130426125054.388178Z#000000#001#000000/0
and got the whole directory back.
I thought only modified entries are transmitted completely, and for unmodified entries only an empty entry (plus entryUUID) is sent?
Is this check valid? If I use slapd, get the contextCSN, modify nothing, and start the ldapsearch -E ..., I should only get back empty entries plus entryUUID? Am I wrong?
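For reference, the CSN used in the cookie above can be taken apart to check that it carries the expected timestamp and server ID. The sketch below assumes the usual layout time#count#sid#mod and treats the last three fields as hexadecimal; parse_csn and sync_cookie are helper names made up for this example. The cookie follows the "rid=...,csn=..." form suggested earlier in the thread.

```python
from datetime import datetime, timezone

def parse_csn(csn):
    """Split a CSN of the form <time>#<count>#<sid>#<mod> into its fields."""
    stamp, count, sid, mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # Assumption: count, sid and mod are hexadecimal numbers.
    return {"time": when, "count": int(count, 16),
            "sid": int(sid, 16), "mod": int(mod, 16)}

def sync_cookie(rid, csn):
    """Build a cookie string in the rid=...,csn=... form."""
    return f"rid={rid},csn={csn}"

csn = "20130426125054.388178Z#000000#001#000000"
fields = parse_csn(csn)
print(fields["time"].isoformat(), "sid:", fields["sid"])
print("-E", "sync=ro/" + sync_cookie(103, csn))
```

The timestamp is in UTC and the sid field corresponds to the provider's serverID (001 here), which makes it easy to confirm that the cookie was taken from the intended server.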
Thanks Meike
On Fri, 26 Apr 2013, Meike Stone wrote:
>> syncrepl really isn't intended for initial "full" loads, although it will work eventually (as you've seen). The preferred method for standing up an offline server is slapadd -q. syncrepl can then handle deltas since the LDIF was generated; this should complete fairly rapidly.
> OK, sounds logical, but if I use slapcat on a running slapd with a big DB, is it guaranteed that the resulting LDIF is consistent and will work with slapadd?
It should be as consistent as any other reading client -- it's explicitly intended to work with slapadd.
> Second, I used a 3 h old LDIF from slapcat on the consumer, and the replication needed about 6 h for the resync (same contextCSN). On the provider (master) I've set loglevel 256, so I used the script ldap-stats.pl (http://prefetch.net/code/ldap-stats.pl.html) to determine how many changes are made per hour (about 5000/h). The server hardware itself is idling the whole time (CPU/disk/network). On the consumer the CPU is running at nearly 100%.
You might want to verify your syncprov and syncrepl syncdata= settings are appropriate for your workload.
> Is it possible to determine which state (present phase, delete phase, ...) the consumer is in (e.g. via the monitoring DB)?
Debug level sync should give you a hint as to where the consumer is.
> Is it possible to simulate the present phase with ldapsearch, to see whether the provider takes so long and, if so, which part (updated entries or unchanged entries) takes so long?
You should be able to form the controls with ldapsearch, but I don't believe ldapsearch can create a persistent client. I'm pretty sure that there are freestanding clients out there if you want to play around with refreshAndPersist.
>>> I'm of the opinion that it was significantly faster with a smaller database.
>> Rule of thumb, the less data to process, the less time it takes...
> But presumably non-linear ...
> Thanks
> Meike
openldap-technical@openldap.org