--On Monday, May 11, 2015 8:15 PM +0200 Emmanuel Lécharny elecharny@gmail.com wrote:
Quanah said that on some heavily loaded servers, the only way for the consumer to catch up is to slapcat/slapadd/restart the consumer. I wonder if that would not be a way to deal with servers that are too far behind the running server, but as a mechanism that is included in the refresh phase (i.e., the restarted server would detect that it has to grab the full set of entries and load them, as if a human being were doing a slapcat/slapadd/restart).
A specific example we had in the past was quarterly updates for students @ Stanford, which could push out tens of thousands of updates to the single-node master. Generally, of the 6 slaves, 2-3 would remain current, and the other 3 would fall hours or days behind. Since serving out significantly out-of-date data was not an option, we'd generally have to resort to reloading the ones that got stuck behind to get them sync'd up in a timely fashion.
Another point: as soon as the server is restarted, it can receive incoming requests, to which it will send back outdated responses until the refresh is completed (and I'm not talking about updates that could also be applied on an outdated base, with the consequences if there are some missing parents). In many cases, that would be a real problem, typically if the LDAP servers are considered part of a shared pool of servers, with a load-balancing mechanism to spread the load. Wouldn't it be more realistic to simply consider the server as not available until the refresh phase is completed?
There's already an option for this, new for OpenLDAP 2.5 IIRC, that makes it return LDAP_BUSY or some such until it is "caught up". However, if you enable that option, it always returns this response, which is problematic, because a server may routinely flip between "caught up" and not "caught up". I.e., it is not unusual for a system to be a second or so behind other masters. Here's real world data from a client I just ran:
[zimbra@zm-mmr01 ~]$ ./libexec/zmreplchk
Master: ldap://zm-mmr01.client.net:389 ServerID: 1 Code: 6 Status: 0y 0M 0w 0d 0h 0m 1s behind CSNs:
    20150504222317.897445Z#000000#001#000000
    20150511174531.424005Z#000000#002#000000
    20150501181032.360324Z#000000#00a#000000
    20150511174535.964334Z#000000#00b#000000
Master: ldap://zm-mmr00.client.net:389 ServerID: 2 Code: 0 Status: In Sync CSNs:
    20150504222317.897445Z#000000#001#000000
    20150511174531.424005Z#000000#002#000000
    20150501181032.360324Z#000000#00a#000000
    20150511174535.964334Z#000000#00b#000000
Master: ldap://nvl-mmr10.client.net:389 ServerID: 10 Code: 6 Status: 0y 0M 0w 0d 0h 0m 1s behind CSNs:
    20150504222317.897445Z#000000#001#000000
    20150511174531.424005Z#000000#002#000000
    20150501181032.360324Z#000000#00a#000000
    20150511174536.315403Z#000000#00b#000000
Master: ldap://nvl-mmr11.client.net:389 ServerID: 11 Code: 6 Status: 0y 0M 0w 0d 0h 0m 1s behind CSNs:
    20150504222317.897445Z#000000#001#000000
    20150511174531.424005Z#000000#002#000000
    20150501181032.360324Z#000000#00a#000000
    20150511174536.315403Z#000000#00b#000000
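To make the "a second or so behind" point concrete, here is a rough Python sketch of how one might compare two sets of contextCSN values and apply a tolerance before declaring a server unusable. This is not how zmreplchk or slapd decide anything; the function names (parse_csn, max_lag_seconds, caught_up) and the 2-second tolerance are purely illustrative assumptions. The only thing taken from the output above is the CSN layout: timestamp#count#SID#mod, with the SID in hex.

```python
from datetime import datetime, timezone

def parse_csn(csn):
    """Split a CSN string into (UTC timestamp, server ID).

    Layout assumed from the output above: timestamp#count#sid#mod,
    where sid is hexadecimal (00a = 10, 00b = 11).
    """
    stamp, _count, sid, _mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    return when.replace(tzinfo=timezone.utc), int(sid, 16)

def max_lag_seconds(local_csns, remote_csns):
    """Return how far (in seconds) local is behind remote, worst SID wins."""
    local = {sid: when for when, sid in map(parse_csn, local_csns)}
    worst = 0.0
    for when, sid in map(parse_csn, remote_csns):
        if sid not in local:
            # Local has never seen a change from this SID: treat as unbounded lag.
            return float("inf")
        worst = max(worst, (when - local[sid]).total_seconds())
    return worst

def caught_up(lag_seconds, tolerance_seconds=2.0):
    """Hypothetical policy: small, routine lag still counts as caught up."""
    return lag_seconds <= tolerance_seconds

# CSNs copied from the zm-mmr01 and nvl-mmr10 entries above.
zm_mmr01 = [
    "20150504222317.897445Z#000000#001#000000",
    "20150511174531.424005Z#000000#002#000000",
    "20150501181032.360324Z#000000#00a#000000",
    "20150511174535.964334Z#000000#00b#000000",
]
nvl_mmr10 = [
    "20150504222317.897445Z#000000#001#000000",
    "20150511174531.424005Z#000000#002#000000",
    "20150501181032.360324Z#000000#00a#000000",
    "20150511174536.315403Z#000000#00b#000000",
]

lag = max_lag_seconds(zm_mmr01, nvl_mmr10)
print(f"lag: {lag:.6f}s, caught up: {caught_up(lag)}")
```

With the data above, the only differing CSN is the one for SID 00b, and the lag works out to about 0.35 seconds. A strict "any difference means busy" check would flag that server as unavailable, whereas a tolerance of a second or two would not, which is the flip-flopping concern in a nutshell.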
--Quanah
--
Quanah Gibson-Mount Platform Architect Zimbra, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration