Hi,
I remember a discussion some time ago about the possibility of delaying access to a syncrepl consumer during the initial DIT load.
I have set up my servers to authenticate replication via the SASL EXTERNAL mechanism and the servers' respective certificates, which is nice.
I have also set up limits and ACLs on the DIT using a groupOfNames:
olcLimits: group/groupOfNames/member="cn=replicators,ou=serviceaccounts,dc=cksoft,dc=net" size.soft=unlimited size.hard=unlimited time.soft=unlimited time.hard=unlimited
olcAccess: to * by group/groupOfNames/member="cn=replicators,ou=serviceaccounts,dc=cksoft,dc=net" read
This is all nice and shiny, and I like having my LDAP consumers configured without cleartext credentials (apart from the cert private key, of course) ;)
The fun starts when I delete the DIT from multiple consumers and allow them to resync from the masters. If they also resync from each other, they can still authenticate using SASL EXTERNAL, but they would not be allowed access to all of the DIT or all of the attributes, and the unlimited olcLimits would not apply, as the group might not yet have been replicated.
I also had major fun just reseeding a single server out of a set of 4, which caused data loss on the servers that connected to the not yet fully synced server.
This is another situation in which it would be nice to be able to disallow any LDAP connections to a consumer while it is in the initial sync phase.
I seem to recall there was a discussion about possibly adding such a feature, but my google-fu is lacking and I cannot find the discussion.
In case there is not yet such a feature, I am considering firewalling access to slapd during the initial sync phase. Is there any LDAP way of reliably detecting that the initial sync has completed, apart from tailing syslog and looking for CSN commit messages?
Greetings Christian
On Mon, Mar 24, 2014 at 10:11:40AM +0100, Christian Kratzer wrote:
This is another situation in which it would be nice to be able to disallow any LDAP connections to a consumer while it is in the initial sync phase.
Any client should be denied during this phase: you do not want to serve incorrect information.
I have a slapd startup script that runs slapd on an alternate port until replication is in sync, then starts it normally. But indeed, preventing service while the DIT is incomplete would be nice.
Emmanuel Dreyfus wrote:
On Mon, Mar 24, 2014 at 10:11:40AM +0100, Christian Kratzer wrote:
This is another situation in which it would be nice to be able to disallow any LDAP connections to a consumer while it is in the initial sync phase.
Any client should be denied during this phase: you do not want to serve incorrect information.
I have a slapd startup script that runs slapd on an alternate port until replication is in sync, then starts it normally.
How do you detect that replication is in sync? Do you look at the contextCSN attribute?
Ciao, Michael.
Hi!
Stupid question: If sync is based on entryUUID and entryCSN and objects are transferred in transactions, how can an obsolete or incomplete object exist on a server that is to be synced?
Regards, Ulrich
Michael Ströder michael@stroeder.com wrote on 24.03.2014 at 12:03 in message 53301109.9070703@stroeder.com:
Emmanuel Dreyfus wrote:
On Mon, Mar 24, 2014 at 10:11:40AM +0100, Christian Kratzer wrote:
This is another situation in which it would be nice to be able to disallow any LDAP connections to a consumer while it is in the initial sync phase.
Any client should be denied during this phase: you do not want to serve incorrect information.
I have a slapd startup script that runs slapd on an alternate port until replication is in sync, then starts it normally.
How do you detect that replication is in sync? Do you look at the contextCSN attribute?
Ciao, Michael.
Ulrich Windl Ulrich.Windl@rz.uni-regensburg.de wrote:
Stupid question: If sync is based on entryUUID and entryCSN and objects are transferred in transactions, how can an obsolete or incomplete object exist on a server that is to be synced?
My biggest problem is not obsolete or incomplete objects, but missing ones after a replica has been offline.
Ulrich Windl wrote:
Hi!
Stupid question: If sync is based on entryUUID and entryCSN and objects are transferred in transactions, how can an obsolete or incomplete object exist on a server that is to be synced?
There cannot be incomplete individual entries. There can, of course, be incomplete collections of entries. And since refreshes occur in arbitrary order, you may have child objects replicated before their parents.
For a large refresh, an entry may be replicated that gets changed again on the provider while the refresh is in progress, and so the version on the consumer is already out of date/obsolete.
Hi,
On Mon, 24 Mar 2014, Ulrich Windl wrote:
Hi!
Stupid question: If sync is based on entryUUID and entryCSN and objects are transferred in transactions, how can an obsolete or incomplete object exist on a server that is to be synced?
If, for example, the ACL on the provider does not show you all attributes because the ACL is based on data not yet synced, then the provider will give the consumer incomplete objects.
Greetings Christian
Christian Kratzer wrote:
If, for example, the ACL on the provider does not show you all attributes because the ACL is based on data not yet synced, then the provider will give the consumer incomplete objects.
That makes no sense, since ACLs on the provider aren't dependent on data from any other server. I.e., whether the data is synced or not on a particular consumer won't change the evaluation of ACLs on the provider.
Hm... Unless of course, your ACLs depend on entries living in a back-ldap instance that points at a particular consumer. That would be quite bizarre.
Hi,
On Mon, 24 Mar 2014, Howard Chu wrote:
Christian Kratzer wrote:
If, for example, the ACL on the provider does not show you all attributes because the ACL is based on data not yet synced, then the provider will give the consumer incomplete objects.
That makes no sense, since ACLs on the provider aren't dependent on data from any other server. I.e., whether the data is synced or not on a particular consumer won't change the evaluation of ACLs on the provider.
In my situation the provider itself is still syncing up, and the ACL depends on the full DIT being in place.
Hm... Unless of course, your ACLs depend on entries living in a back-ldap instance that points at a particular consumer. That would be quite bizarre.
I have the following:
olcLimits: group/groupOfNames/member="cn=replicators,ou=serviceaccounts,dc=cksoft,dc=net" size.soft=unlimited size.hard=unlimited time.soft=unlimited time.hard=unlimited
olcAccess: to * by group/groupOfNames/member="cn=replicators,ou=serviceaccounts,dc=cksoft,dc=net" read
and the following group in the DIT with the mapped SASL identities of the servers:
dn: cn=replicators,ou=ServiceAccounts,dc=cksoft,dc=net
objectClass: groupOfNames
cn: replicators
member: cn=ldap1.cksoft.de,ou=ServiceAccounts,dc=cksoft,dc=net
member: cn=ldap2.cksoft.de,ou=ServiceAccounts,dc=cksoft,dc=net
...
...
...
The situation I am getting at is:
1. Provider A has the data.
2. Consumer B is empty and starts to sync up from provider A.
3. Consumer C is empty and starts to sync up from B.
4. The above group will not yet be populated on B, as B is still empty.
5. B will not apply the above olcLimits clause to the connection C is on.
6. B will not show all entries or all attributes to C, as the ACL will not match.
I can see the above happening quite easily in a 3 or 4 server multimaster cluster when one of them is being resynced.
Denying client connections in the initial sync phase is the trivial fix that will enforce consistency.
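The A/B/C scenario above can be sketched as a toy model in Python. This is not slapd's ACL engine: `Server`, `search_as`, `build_provider_a` and the `uid=alice` entry are hypothetical names invented for illustration, and the model only captures the one rule from the olcAccess line quoted above (read access is granted when the bound identity is a member of cn=replicators in the local database).

```python
"""Toy model: a group-based read ACL evaluated on a server that is still empty."""

REPLICATORS = "cn=replicators,ou=serviceaccounts,dc=cksoft,dc=net"
LDAP1 = "cn=ldap1.cksoft.de,ou=ServiceAccounts,dc=cksoft,dc=net"
LDAP2 = "cn=ldap2.cksoft.de,ou=ServiceAccounts,dc=cksoft,dc=net"


class Server:
    """Hypothetical stand-in for a slapd instance: a dict of DN -> attributes."""

    def __init__(self, name):
        self.name = name
        self.entries = {}

    def is_replicator(self, bind_dn):
        # The group entry is looked up in the LOCAL database, so the ACL can
        # only match once the group itself has been replicated to this server.
        group = self.entries.get(REPLICATORS)
        if group is None:
            return False
        members = {m.lower() for m in group.get("member", [])}
        return bind_dn.lower() in members

    def search_as(self, bind_dn):
        # Group members may read everything; in this model others see nothing.
        return dict(self.entries) if self.is_replicator(bind_dn) else {}


def build_provider_a():
    """Provider A holds the full DIT, including the replicators group."""
    a = Server("A")
    a.entries = {
        REPLICATORS: {"member": [LDAP1, LDAP2]},
        "uid=alice,ou=people,dc=cksoft,dc=net": {"cn": ["Alice"]},  # made-up entry
    }
    return a


if __name__ == "__main__":
    a, b = build_provider_a(), Server("B")
    # C (bound as LDAP2) syncs from B before B has received anything from A.
    print(len(b.search_as(LDAP2)))      # prints 0: the group is missing on B
    # B (bound as LDAP1) completes its own refresh from A.
    b.entries = a.search_as(LDAP1)
    print(len(b.search_as(LDAP2)))      # prints 2: the ACL now matches
```

The point of the sketch is only that C's view of B depends on when C connects, which is why blocking consumers until B is populated removes the inconsistency.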
Greetings Christian
Michael Ströder michael@stroeder.com wrote:
Do you look at the contextCSN attribute?
Yes, I have a loop that tests the master and replica contextCSN, and starts slapd on the standard port once they match.
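A loop of that kind could look roughly like the following Python sketch. It is not the actual startup script: the function names are made up, and `fetch_context_csns` assumes that the `ldapsearch` CLI is installed and that an anonymous simple bind may read contextCSN on the suffix entry. Note also that on a busy provider the contextCSN keeps moving, so a match only says the consumer had caught up at that instant.

```python
"""Sketch: wait until a consumer's contextCSN values match the provider's."""
import subprocess
import time


def parse_context_csns(ldif_text):
    """Extract the set of contextCSN values from LDIF text."""
    csns = set()
    for line in ldif_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "contextcsn":
            csns.add(value.strip())
    return csns


def fetch_context_csns(uri, suffix):
    """Read contextCSN from the suffix entry using the ldapsearch CLI.

    Assumption: anonymous simple bind (-x) is allowed to read the attribute.
    """
    result = subprocess.run(
        ["ldapsearch", "-LLL", "-x", "-H", uri,
         "-s", "base", "-b", suffix, "contextCSN"],
        check=True, capture_output=True, text=True,
    )
    return parse_context_csns(result.stdout)


def in_sync(provider_csns, consumer_csns):
    """True when both sides report the same non-empty set of contextCSN values."""
    return bool(provider_csns) and provider_csns == consumer_csns


def wait_until_in_sync(get_provider, get_consumer, interval=5.0,
                       max_attempts=None, sleep=time.sleep):
    """Poll both servers until their contextCSN sets match.

    Returns True on a match, False if max_attempts is exhausted first.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        if in_sync(get_provider(), get_consumer()):
            return True
        attempt += 1
        sleep(interval)
    return False
```

In a startup script one would pass two small callables wrapping `fetch_context_csns` (one for the provider URI, one for the consumer on its alternate port) and restart slapd on the standard port once `wait_until_in_sync` returns True.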
Christian Kratzer wrote:
I remember a discussion some time ago about the possibility of delaying access to a syncrepl consumer during the initial DIT load.
I seem to recall there was a discussion about possibly adding such a feature, but my google-fu is lacking and I cannot find the discussion.
http://www.openldap.org/lists/openldap-technical/201306/msg00235.html
-> http://www.openldap.org/its/index.cgi?findid=7616
Ciao, Michael.
Christian Kratzer wrote:
Hi,
I remember a discussion some time ago about the possibility of delaying access to a syncrepl consumer during the initial DIT load.
http://www.openldap.org/lists/openldap-bugs/201308/msg00043.html
Feel free to experiment with it and see whether it suits your need.
Hi,
On Mon, 24 Mar 2014, Howard Chu wrote:
http://www.openldap.org/lists/openldap-bugs/201308/msg00043.html
Feel free to experiment with it and see whether it suits your need.
Yes, thanks. I think that was the posting I had in mind. I'll give it a look.
Greetings Christian
Howard Chu wrote:
http://www.openldap.org/lists/openldap-bugs/201308/msg00043.html
Feel free to experiment with it and see whether it suits your need.
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Ciao, Michael.
Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Just expedient at the time, it would take more code to cover other cases. Also it wasn't clear to me that this was actually a good approach, thus the need for other developers to test it and give feedback.
The biggest problem with this approach is that it has global impact - turning off slapd's listeners. On a slapd with multiple independent databases, this would be a bad idea.
Howard Chu wrote:
Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Just expedient at the time, it would take more code to cover other cases. Also it wasn't clear to me that this was actually a good approach, thus the need for other developers to test it and give feedback.
Hmm, it's some work and risk to change a serious setup to delta-syncrepl.
The biggest problem with this approach is that it has global impact - turning off slapd's listeners. On a slapd with multiple independent databases, this would be a bad idea.
IMO that's not really an issue because 1. you likely have many replicas serving the same set of databases to clients and 2. it's very likely that you're initializing all DBs at once because you are deploying a new server, recovering after file system corruption, etc.
Anyway, having an slapd-internal mechanism is way better than external workarounds with monitoring contextCSN and using iptables etc.
Ciao, Michael.
Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Just expedient at the time, it would take more code to cover other cases. Also it wasn't clear to me that this was actually a good approach, thus the need for other developers to test it and give feedback.
Hmm, it's some work and risk to change a serious setup to delta-syncrepl.
The biggest problem with this approach is that it has global impact - turning off slapd's listeners. On a slapd with multiple independent databases, this would be a bad idea.
IMO that's not really an issue because
1. you likely have many replicas serving the same set of databases to clients and
2. it's very likely that you're initializing all DBs at once because you are deploying a new server, recovering after file system corruption, etc.
An obvious counter-example: a server with multiple databases, consuming from distinct providers. If the connection to one of those providers is lost and reconnected, the current code would disable slapd globally but only one of the multiple DBs needs to be blocked.
Anyway having an slapd-internal mechanism is way better than external work-arounds with monitoring contextCSN and using iptables etc.
Sure. But again, it requires more thinking, there are plenty of undesirable side-effects to any approach you can mention.
Howard Chu wrote:
Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Just expedient at the time, it would take more code to cover other cases. Also it wasn't clear to me that this was actually a good approach, thus the need for other developers to test it and give feedback.
Hmm, it's some work and risk to change a serious setup to delta-syncrepl.
The biggest problem with this approach is that it has global impact - turning off slapd's listeners. On a slapd with multiple independent databases, this would be a bad idea.
IMO that's not really an issue because
1. you likely have many replicas serving the same set of databases to clients and
2. it's very likely that you're initializing all DBs at once because you are deploying a new server, recovering after file system corruption, etc.
An obvious counter-example: a server with multiple databases, consuming from distinct providers. If the connection to one of those providers is lost and reconnected, the current code would disable slapd globally but only one of the multiple DBs needs to be blocked.
If you have HA requirements in such a replication setup you will set up another replica with such a configuration => see "set of databases" in item 1. above.
Anyway having an slapd-internal mechanism is way better than external work-arounds with monitoring contextCSN and using iptables etc.
Sure. But again, it requires more thinking, there are plenty of undesirable side-effects to any approach you can mention.
Thinking is always good.
Ciao, Michael.
--On Monday, March 24, 2014 4:28 PM +0100 Michael Ströder michael@stroeder.com wrote:
Howard Chu wrote:
Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Just expedient at the time, it would take more code to cover other cases. Also it wasn't clear to me that this was actually a good approach, thus the need for other developers to test it and give feedback.
Hmm, it's some work and risk to change a serious setup to delta-syncrepl.
Serious setups use delta-syncrepl.
--Quanah
--
Quanah Gibson-Mount
Architect - Server
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
Also some relevant discussion here http://www.openldap.org/lists/openldap-devel/200510/msg00083.html
The biggest danger is in a multi-master config where two nodes point at each other. If they are both started at the same time, and both deny searches while refresh is in progress, it's easy to get into a state where both servers are stuck waiting for each other's refresh to complete.
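The stuck state described above can be illustrated with a toy simulation in Python. It is a simplified model, not slapd behaviour in detail: the `Node` class and the `deny_while_refreshing` flag are hypothetical, and a refresh is reduced to a single search against the peer that either is accepted or is not.

```python
"""Toy simulation of two multi-master nodes refreshing from each other."""


class Node:
    def __init__(self, name, deny_while_refreshing):
        self.name = name
        self.refreshing = True  # both nodes start in the refresh phase
        self.deny_while_refreshing = deny_while_refreshing
        self.peer = None

    def accepts_sync_search(self):
        # A strict node refuses every search until its own refresh is done.
        return not (self.refreshing and self.deny_while_refreshing)

    def try_refresh(self):
        # The refresh is itself a search against the peer.
        if self.refreshing and self.peer.accepts_sync_search():
            self.refreshing = False


def run(deny_while_refreshing, rounds=10):
    """Start two nodes pointing at each other; return their final refresh flags."""
    a = Node("A", deny_while_refreshing)
    b = Node("B", deny_while_refreshing)
    a.peer, b.peer = b, a
    for _ in range(rounds):
        a.try_refresh()
        b.try_refresh()
    return a.refreshing, b.refreshing


if __name__ == "__main__":
    print(run(deny_while_refreshing=True))   # (True, True): neither ever finishes
    print(run(deny_while_refreshing=False))  # (False, False): both complete
```

However many rounds are run, the strict variant never leaves the refresh phase, because each node's refresh is exactly the kind of request its peer is refusing.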
Hi,
On Mon, 24 Mar 2014, Michael Ströder wrote:
Why was this undocumented strictrefresh option limited to delta-syncrepl?
OK, thanks for the pointer.
That's why it is not working in my setup, as this old horse still does regular syncrepl....
I see the following in servers/slapd/syncrepl.c:
    if (si->si_strict_refresh) {
        slap_suspend_listeners();
        connections_drop();
    }
and the following turns the listeners back on later:
    if ( err == LDAP_SUCCESS
        && si->si_logstate == SYNCLOG_FALLBACK ) {
        si->si_logstate = SYNCLOG_LOGGING;
        rc = LDAP_SYNC_REFRESH_REQUIRED;
        slap_resume_listeners();
    } else {
        rc = -2;
    }
The logic seems to be that once we are synced up with one of the providers, we should be set, as the other providers should have identical data.
I can also see the next caveat: resetting all connections is possibly triggered each time a syncrepl connection is set up to a provider. Connections on a consumer might even be dumped when a provider is restarted and the consumer reconnects.
I would really like such a feature for consistency but this seems to need more deep thinking.
Greetings Christian
Christian Kratzer wrote:
I would really like such a feature for consistency but this seems to need more deep thinking.
Just posted a patch to ITS#7616. http://www.openldap.org/lists/openldap-bugs/201403/msg00044.html
Try that and see how it goes for you. It should prevent incoming consumer requests if the local database is empty. Otherwise it allows incoming consumers, to avoid the MMR deadlock situation. Aside from that it rejects incoming generic searches while a refresh is in progress.
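The three rules in that description can be written down as a small decision function. The Python sketch below is only a model of the prose above, not the patch itself: `DbState` and `admit` are hypothetical names, and the case of a generic search against an empty database with no refresh running is not covered by the description, so the model simply admits it.

```python
"""Model of the admission rules described for the ITS#7616 patch (not slapd code)."""
from dataclasses import dataclass


@dataclass
class DbState:
    is_empty: bool             # the local database holds no entries yet
    refresh_in_progress: bool  # a syncrepl refresh is currently running


def admit(request_is_sync_consumer: bool, db: DbState) -> bool:
    """Return True if the incoming request should be served."""
    if request_is_sync_consumer:
        # Consumers are refused only while there is nothing to hand out.
        # Admitting them otherwise is what avoids the MMR deadlock.
        return not db.is_empty
    # Generic searches are rejected until the refresh has finished.
    return not db.refresh_in_progress


if __name__ == "__main__":
    reseeding = DbState(is_empty=True, refresh_in_progress=True)
    catching_up = DbState(is_empty=False, refresh_in_progress=True)
    print(admit(True, reseeding))     # False: empty DB, consumers must wait
    print(admit(True, catching_up))   # True: peers may still sync from us
    print(admit(False, catching_up))  # False: clients wait for the refresh
```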
openldap-technical@openldap.org