--On Tuesday, October 15, 2013 11:47 PM +0200 Bruno Marcon marcon.bruno@free.fr wrote:
I have found the origin of the memory increase. The problem occurs with a huge group and RefreshAndPersist replication without delta-syncrepl, so each slave pulls the entire entry from the master.
The LDAP directory contains about 100,000 users. All of them are members of a group named "XXX", so the group entry has 100,000 "member" attribute values, one per user.
So when I assign 20 users per second to group XXX, the master pushes (or the slaves pull?) the whole XXX entry 20 times per second to each slave (there are 5 slaves).
5 x 20 x 100,000 = 10,000,000 attribute values per second. The master handles this rate for several seconds, perhaps one minute, then its memory usage suddenly increases and it is killed by the OS at 2 GB of memory.
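The estimate above can be written as a small calculation. This is only a rough sketch: the function name is made up for illustration, and it counts "member" values sent on the wire, not bytes or anything slapd does internally.

```python
# Back-of-the-envelope estimate of the replication volume when the whole
# group entry is resent on every change. Figures are the ones quoted in this
# message; the function name is hypothetical.

def replicated_values_per_second(consumers, mods_per_second, values_per_entry):
    """Attribute values sent per second when each modification causes the
    full entry to be shipped to every consumer."""
    return consumers * mods_per_second * values_per_entry


# 5 slaves, 20 group modifications per second, 100,000 member values.
full_entry_rate = replicated_values_per_second(5, 20, 100_000)
print(full_entry_rate)
```

The rate scales with the size of the group, so it keeps growing as more users are added to XXX.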
To avoid this, I have configured delta-syncrepl so that only the modification is sent: each time a user is assigned to the group, only the "member" value for that user is replicated.
One question remains: why does the LDAP server's memory grow? Shouldn't it slow down once it reaches a limit, rather than keep storing things in memory?
Our main problem now is this: we have two masters in mirror mode and 4 replicas. For the replicas, I can use delta-syncrepl. For the recovery master (for failover), I can't, because delta-syncrepl is not supported in mirror mode. So we have configured replication between the masters in RefreshOnly mode with a polling period of 5 seconds, to avoid the memory increase seen in RefreshAndPersist mode without delta-syncrepl. As a result, the recovery master lags the primary master by up to 5 s, so if the primary master crashes for any reason, the recovery master may have missed up to 5 s of modifications (user creations, for example). BUT the replicas, whose replication is immediate (RefreshAndPersist mode), are more up to date than the recovery master and may have more users.
If we can restart the "old" primary master, the recovery master gets the missing users through replication. If we can't restart the primary master because it is down, we are left with a recovery master that has fewer users than the replicas, so there is an inconsistency between the replicas and the recovery master.
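To give an idea of the size of that gap, here is a rough sketch. It assumes the write rate at the time of the crash is the 20 modifications per second mentioned earlier (an assumption, the real rate may differ) and that the worst case is a crash just before the next poll. The function name is made up for illustration.

```python
# Worst-case number of modifications the recovery master has not yet seen
# when the primary master dies, under RefreshOnly polling.
# Assumption: a constant write rate and a crash just before the next poll.

def max_missed_modifications(mods_per_second, polling_interval_seconds):
    """Upper bound on modifications made since the last completed poll."""
    return mods_per_second * polling_interval_seconds


# 20 modifications per second, 5-second polling period.
print(max_missed_modifications(20, 5))  # prints 100
```

Under these assumptions, up to 100 modifications could exist on the replicas but not on the recovery master.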
Do you have any solution to this problem? It occurs with a big LDAP database and a high rate of modifications on the master, because we work on a huge project with thousands of users (and perhaps millions in the future).
Regards, Bruno Marcon.
-----Original Message----- From: MARCON Bruno [mailto:Bruno.MARCON@thalesgroup.com] Sent: Wednesday, October 9, 2013 14:04 To: Bruno Marcon Subject: RE: (ITS#7717) Sudden memory increase leading to Master LDAP crash
Bruno Marcon ThereSIS Innovation lab, ICT Security Unit
- IT Security Expert
- Software Architect
Thales Research & Technology Campus Polytechnique 1, avenue Augustin Fresnel 91767 Palaiseau cedex France Office: +33 (0)1 69 41 60 96 Fax: +33 (0)1 69 41 55 63
-----Original Message----- From: Bruno Marcon [mailto:marcon.bruno@free.fr] Sent: Tuesday, October 8, 2013 18:05 To: MARCON Bruno Subject: FW: (ITS#7717) Sudden memory increase leading to Master LDAP crash
-----Original Message----- From: Quanah Gibson-Mount [mailto:quanah@zimbra.com] Sent: Monday, October 7, 2013 16:56 To: marcon.bruno@free.fr; openldap-its@openldap.org Subject: Re: (ITS#7717) Sudden memory increase leading to Master LDAP crash
--On Thursday, October 03, 2013 1:14 PM +0000 marcon.bruno@free.fr wrote:
The problem is very easy to reproduce: one master and one slave are sufficient. Then run 9 windows, each with a script that adds an entry and modifies an entry.
Please provide full configuration details.
--Quanah
--
Quanah Gibson-Mount Architect - Server Zimbra Software, LLC
Zimbra :: the leader in open source messaging and collaboration
--
Quanah Gibson-Mount Architect - Server Zimbra, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration