Hello guys, I have a problem with delta-syncrepl replication (all set up according to the 'official' guide - http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl). I have a master instance with logs 'shipped' to a client, and it all works fine as long as the connection is good. Getting ready to move into production, I'm trying to emulate connectivity problems, and this is where I ran into trouble.
Specifically, even though I have the mirror instance set up as:

syncrepl rid=101
    provider=ldap://192.168.22.62:389
    type=refreshAndPersist
    bindmethod=simple
    binddn="cn=replicator,xxxxx"
    credentials="xxxxxx"
    searchbase="xxxxxxx"
    filter="(objectClass=*)"
    logbase="cn=accesslog"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    scope=sub
    attrs="*,+"
    schemachecking=off
    retry="1 +"
    syncdata=accesslog
once I have the server disconnected (I simply restart slapd on the master), the client does not even try to reconnect. The log below shows a modification operation at 18:34:18 that went fine; 11 seconds later I restart the master's ldap service (which became immediately available again):
Jul 28 18:34:18 newton slapd[20353]: => entry_encode(0x00000032): mail=xxxxxxxxxxxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: bdb_modify: updated id=00000032 dn="yyyyyyyyyyyyyyyyyyyyyyyy"
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: conn=-1 op=0 p=0
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: err=0 matched="" text=""
Jul 28 18:34:18 newton slapd[20353]: syncrepl_entry: rid 101 be_modify (0)
Jul 28 18:34:18 newton slapd[20353]: bdb_modify: xxxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: bdb_dn2entry("oxxxxxxxxxxxxxxx")
Jul 28 18:34:18 newton slapd[20353]: bdb_modify_internal: 0x00000001: o=xxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: <= acl_access_allowed: granted to database root
Jul 28 18:34:18 newton slapd[20353]: bdb_modify_internal: replace contextCSN
Jul 28 18:34:18 newton slapd[20353]: => entry_encode(0x00000001): o=xxxxxxxxxxxxxxxxxxxxx.
Jul 28 18:34:18 newton slapd[20353]: bdb_modify: updated id=00000001 dn="xxxxxxxxxxxxxxxx"
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: conn=-1 op=0 p=0
Jul 28 18:34:18 newton slapd[20353]: send_ldap_result: err=0 matched="" text=""
Jul 28 18:34:18 newton slapd[20353]: daemon: activity on 1 descriptor
Jul 28 18:34:18 newton slapd[20353]: daemon: activity on:
Jul 28 18:34:18 newton slapd[20353]:
Jul 28 18:34:18 newton slapd[20353]: daemon: epoll: listen=7 active_threads=0 tvp=NULL
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on 1 descriptor
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on:
Jul 28 18:34:29 newton slapd[20353]: 14r
Jul 28 18:34:29 newton slapd[20353]:
Jul 28 18:34:29 newton slapd[20353]: daemon: read active on 14
Jul 28 18:34:29 newton slapd[20353]: daemon: epoll: listen=7 active_threads=0 tvp=NULL
Jul 28 18:34:29 newton slapd[20353]: connection_get(14)
Jul 28 18:34:29 newton slapd[20353]: connection_get(14): got connid=0
Jul 28 18:34:29 newton slapd[20353]: =>do_syncrepl rid 101
Jul 28 18:34:29 newton slapd[20353]: =>do_syncrep2 rid 101
Jul 28 18:34:29 newton slapd[20353]: do_syncrep2: rid 101 Can't contact LDAP server
Jul 28 18:34:29 newton slapd[20353]: connection_get(14)
Jul 28 18:34:29 newton slapd[20353]: connection_get(14): got connid=0
Jul 28 18:34:29 newton slapd[20353]: daemon: removing 14
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on 1 descriptor
Jul 28 18:34:29 newton slapd[20353]: daemon: activity on:
Jul 28 18:34:29 newton slapd[20353]:
Jul 28 18:34:29 newton slapd[20353]: daemon: epoll: listen=7 active_threads=0 tvp=NULL
Jul 28 18:34:29 newton slapd[20353]: do_syncrepl: rid 101 quitting
I'm running openldap 2.3.43-12.el5_5.1 from standard CentOS 5.4 installation.
Am I getting something wrong, and the slave is not supposed to reconnect after the master service restarts, or is this a problem that was fixed in later versions?
Thank you, Alex
Dear Alex,
On 28/07/10 18:57 -0400, Alexander Ivanov wrote:
> Hello guys, I have a problem with delta-syncrepl replication (all set up according to the 'official' guide - http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl). I have a master instance with logs 'shipped' to a client, and it all works fine as long as the connection is good. Getting ready to move into production, I'm trying to emulate connectivity problems, and this is where I ran into trouble.
[snip]
> once I have the server disconnected (I simply restart slapd on the master), the client does not even try to reconnect. The log below shows a modification operation at 18:34:18 that went fine; 11 seconds later I restart the master's ldap service (which became immediately available again):
I am having the same trouble, but with ordinary syncrepl. As soon as the master is restarted, the slaves all quit their syncrepl threads, and never start again:
Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrep2: rid 003 Can't contact LDAP server
Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrepl: rid 003 quitting
This is a serious barrier to deployment in a busy production environment with many slaves.
> Jul 28 18:34:29 newton slapd[20353]: do_syncrepl: rid 101 quitting
> I'm running openldap 2.3.43-12.el5_5.1 from standard CentOS 5.4 installation.
I am running the same openldap as you, on CentOS 5.5.
> Am I getting something wrong, and the slave is not supposed to reconnect after the master service restarts, or is this a problem that was fixed in later versions?
I have exactly the same question. I don't think Alex and I are the only ones with this situation.
slapd.conf on provider:
=======================

# slapd.conf generated by /usr/bin/conform

include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/local.schema

loglevel stats sync
allow bind_v2
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
tool-threads 4
modulepath /usr/lib64/openldap

############################################################
# GLOBAL database definition
############################################################

access to dn.base=""
    by * read

access to dn.base="cn=Subschema"
    by * read

############################################################
# ou=tree,ou=name database definition
############################################################

database bdb
suffix "ou=tree,ou=name"
rootdn cn=manager,ou=tree,ou=name
rootpw root-password
directory /var/lib/ldap/ou=tree,ou=name
index entryCSN eq
index entryUUID eq
index objectClass eq
index uid eq
index username eq

cachesize 1000000
idlcachesize 1000000
checkpoint 65536 240
idletimeout 300
writetimeout 90000
limits dn.base=cn=syncrepl,ou=tree,ou=name
    size.soft=unlimited size.hard=unlimited
    time.soft=unlimited time.hard=unlimited

access to dn.subtree="ou=tree,ou=name"
    by dn="cn=syncrepl,ou=tree,ou=name" read
    by peername.ip=227.137.34.172 read
    by peername.ip=209.146.228.56 read
    by peername.ip=147.107.14.11 read
    by peername.ip=127.0.0.1 read
    by * none break

access to dn.subtree="ou=tree,ou=name" attrs=userPassword
    by anonymous auth
    by * none break

overlay syncprov
checkpoint 1000 5
sessionlog 100000
slapd.conf on consumer:
=======================

# slapd.conf generated by /usr/bin/conform

include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/local.schema

loglevel stats sync
allow bind_v2
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
tool-threads 8

############################################################
# GLOBAL database definition
############################################################

access to dn.base=""
    by * read

access to dn.base="cn=Subschema"
    by * read

############################################################
# ou=tree,ou=name database definition
############################################################

database bdb
suffix "ou=tree,ou=name"
rootdn cn=manager,ou=tree,ou=name
rootpw root-password
directory /var/lib/ldap/ou=tree,ou=name
index entryCSN eq
index entryUUID eq
index objectClass eq
index uid eq
index username eq

cachesize 1000000
idlcachesize 1000000
checkpoint 65536 240
idletimeout 300
writetimeout 90000

access to dn.subtree="ou=tree,ou=name"
    by peername.ip=49.66.187.43 read
    by peername.ip=139.243.36.117 read
    by peername.ip=115.165.210.17 read
    by peername.ip=25.79.141.72%255.255.255.0 read
    by peername.ip=127.0.0.1 read
    by * none break

access to dn.base="ou=tree,ou=name"
    by peername.ip=238.118.197.179 read
    by * none break

access to dn.subtree="ou=tree,ou=name" attrs=userPassword
    by anonymous auth
    by * none break
syncrepl rid=003
    provider=ldap://master:389
    type=refreshAndPersist
    bindmethod=simple
    binddn="cn=syncrepl,ou=tree,ou=name"
    credentials=syncrepl-password
    searchbase="ou=tree,ou=name"
If you see any problems with these configuration files, please let me know, even if they do not relate to the problem of syncrepl terminating after master is restarted.
I will send further information if that would help; please let me know what would cast light on this.
> syncrepl rid=003
>     provider=ldap://master:389
>     type=refreshAndPersist
>     bindmethod=simple
>     binddn="cn=syncrepl,ou=tree,ou=name"
>     credentials=syncrepl-password
>     searchbase="ou=tree,ou=name"
There is no "retry" here. See slapd.conf(5) and the admin guide for indications about how syncrepl should be configured.
p.
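For anyone checking a consumer config for the same omission, here is a minimal Python sketch that tests whether a syncrepl directive carries a retry parameter. The function names and the retry value "60 10 300 +" are illustrative assumptions, not taken from either poster's setup; per slapd.conf(5), that value means retry every 60 seconds for the first 10 attempts, then every 300 seconds indefinitely. The sample stanza is modelled on the consumer directive in this thread, written with balanced quotes.

```python
import shlex


def parse_syncrepl(stanza: str) -> dict:
    """Split a syncrepl directive into its key=value parameters.

    shlex handles the double-quoted values, so 'retry="60 10 300 +"'
    stays a single parameter. Hypothetical helper, not part of OpenLDAP.
    """
    tokens = shlex.split(stanza)
    if not tokens or tokens[0].lower() != "syncrepl":
        raise ValueError("not a syncrepl directive")
    params = {}
    for token in tokens[1:]:
        # Split on the first '=' only: DNs such as cn=syncrepl,... contain more.
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key.lower()] = value
    return params


def has_retry(stanza: str) -> bool:
    """True if the directive specifies a retry parameter."""
    return "retry" in parse_syncrepl(stanza)


# Sample stanza modelled on the thread; the retry value is an assumption.
WITHOUT_RETRY = (
    "syncrepl rid=003 provider=ldap://master:389 type=refreshAndPersist "
    'bindmethod=simple binddn="cn=syncrepl,ou=tree,ou=name" '
    'credentials=syncrepl-password searchbase="ou=tree,ou=name"'
)
WITH_RETRY = WITHOUT_RETRY + ' retry="60 10 300 +"'

for name, stanza in (("without retry", WITHOUT_RETRY), ("with retry", WITH_RETRY)):
    print(name, "->", "ok" if has_retry(stanza) else "retry is missing")
```

Note that shlex raises ValueError on an unbalanced quote, so a stanza with a missing closing quote is rejected rather than silently misparsed.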
Nick Urbanik wrote:
> Dear Alex,
>
> On 28/07/10 18:57 -0400, Alexander Ivanov wrote:
>> Hello guys, I have a problem with delta-syncrepl replication (all set up according to the 'official' guide
>> I'm running openldap 2.3.43-12.el5_5.1 from standard CentOS 5.4 installation.
> I am running the same openldap as you, on CentOS 5.5.
It's generally a mistake to read the docs for a different version of the software than you're actually running.
>> I have a master instance with logs 'shipped' to a client, and it all works fine as long as the connection is good. Getting ready to move into production, I'm trying to emulate connectivity problems, and this is where I ran into trouble.
> [snip]
>> once I have the server disconnected (I simply restart slapd on the master), the client does not even try to reconnect. The log below shows a modification operation at 18:34:18 that went fine; 11 seconds later I restart the master's ldap service (which became immediately available again):
> I am having the same trouble, but with ordinary syncrepl. As soon as the master is restarted, the slaves all quit their syncrepl threads, and never start again:
> syncrepl rid=003
>     provider=ldap://master:389
>     type=refreshAndPersist
>     bindmethod=simple
>     binddn="cn=syncrepl,ou=tree,ou=name"
>     credentials=syncrepl-password
>     searchbase="ou=tree,ou=name"
> If you see any problems with these configuration files, please let me know, even if they do not relate to the problem of syncrepl terminating after master is restarted.
You have no "retry" parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.
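To illustrate how a retry value is read, the sketch below expands a retry specification into the sequence of waits it describes, following the `<interval> <count>` pair format documented in slapd.conf(5), where a count of + means retry indefinitely. The function name is hypothetical, and treating a missing retry as "no further attempts" models the 2.3 behaviour described in this thread, not slapd's actual code.

```python
from itertools import islice
from typing import Iterator, Optional


def retry_delays(spec: Optional[str]) -> Iterator[int]:
    """Yield the wait in seconds before each reconnect attempt.

    spec is the value of the retry parameter, e.g. "60 10 300 +".
    None stands for a directive with no retry parameter (assumption:
    the consumer then makes no further attempts, as reported above).
    """
    if spec is None:
        return
    fields = spec.split()
    if not fields or len(fields) % 2:
        raise ValueError("retry must be pairs of <interval> <count>")
    for interval, count in zip(fields[0::2], fields[1::2]):
        seconds = int(interval)
        if count == "+":
            # '+' means keep retrying at this interval indefinitely.
            while True:
                yield seconds
        for _ in range(int(count)):
            yield seconds


# First five waits for "every 60 s three times, then every 300 s forever".
print(list(islice(retry_delays("60 3 300 +"), 5)))  # [60, 60, 60, 300, 300]
# No retry parameter: no reconnect attempts at all.
print(list(retry_delays(None)))  # []
```

With the value from the first post, retry="1 +", the generator yields 1 forever, that is, one attempt per second with no upper limit.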
> You have no "retry" parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.
I'd also note that slapd will issue
syncrepl rid=003 searchbase="ou=tree,ou=name": no retry defined, using default
if no retry is configured; one should at least wonder what that message means. I'd favor refusing to start if no retry is configured, since replication is not reliable without.
p.
masarati@aero.polimi.it wrote:
>> You have no "retry" parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.
> I'd also note that slapd will issue
> syncrepl rid=003 searchbase="ou=tree,ou=name": no retry defined, using default
> if no retry is configured; one should at least wonder what that message means. I'd favor refusing to start if no retry is configured, since replication is not reliable without.
That message was added in 2.4; these guys are using 2.3. At this point I've grown tired of telling people "you're using an obsolete release, you should upgrade."
Dear Masarati,
On 13/08/10 02:44 +0200, masarati@aero.polimi.it wrote:
>> You have no "retry" parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.
Bless you, thank you very much for that help.
> I'd also note that slapd will issue
> syncrepl rid=003 searchbase="ou=tree,ou=name": no retry defined, using default
> if no retry is configured; one should at least wonder what that message means. I'd favor refusing to start if no retry is configured, since replication is not reliable without.
Yes, that makes sense.
[root@ldapro04.syd ~]# grep -P '\bretry' /var/log/ldap*
[root@ldapro04.syd ~]#
No such error message seems to be present.
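Since this release does not log a retry warning, one way to spot a consumer that has given up is to look for the "quitting" line quoted earlier in this thread. The following is only a rough Python sketch; the function name is hypothetical, and the pattern is based solely on the two log lines shown above, so other slapd versions may word the message differently.

```python
import re

# Matches the line logged when a syncrepl thread gives up, as seen in this thread.
QUIT_RE = re.compile(r"do_syncrepl: rid (\d+) quitting")


def quit_rids(log_lines):
    """Return the rids whose syncrepl thread logged 'quitting', in order seen."""
    rids = []
    for line in log_lines:
        match = QUIT_RE.search(line)
        if match and match.group(1) not in rids:
            rids.append(match.group(1))
    return rids


# The two log lines reported from the consumer ldapro04.
SAMPLE = [
    "Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrep2: rid 003 Can't contact LDAP server",
    "Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrepl: rid 003 quitting",
]

print(quit_rids(SAMPLE))  # ['003']
```

To run it against a real log, pass an open file object instead of the sample list, since the function only needs an iterable of lines.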
Dear Howard,
On 12/08/10 17:34 -0700, Howard Chu wrote:
> You have no "retry" parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.
Thank you very much indeed for your very helpful, prompt and accurate reply! I will happily buy you a beer or beverage of your choice if I see you at Linuxconf or elsewhere.
openldap-technical@openldap.org