Hi all,
We have set up a couple of servers in an N-way multimaster config, using back-config, as explained in the admin guide. These all use RE24, checked out today.
We are now trying to add another server to the existing cluster. To do this, we want to replicate the existing cn=config branch from the cluster to initialize the config for the new server.
To do this, we start the new server with a minimal cn=config branch, making it a syncrepl consumer to an existing server (consumer only, no multimaster on this new server):

8<------------
dn: cn=config
objectClass: olcGlobal
cn: config
olcServerID: 2
olcLogLevel: sync stats
dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootPW: secret
olcSyncRepl: rid=001 provider=ldap://server1/ binddn="cn=config"
  bindmethod=simple credentials=secret searchbase="cn=config"
  type=refreshAndPersist retry="5 5 300 5" timeout=3
8<------------
Then, we start the server with slapd -c "rid=001,csn=0" to force a full reload. This successfully loads the config branch from the master, *except* for entries that already existed in the new server's config branch. These produce the following errors:

8<------------
Feb 3 18:22:43 server2 slapd[12893]: @(#) $OpenLDAP: slapd 2.4.X (Feb 3 2009 12:10:46) $ root@server2.test.lan:/root/sources/openldap-cvs-re24/servers/slapd
Feb 3 18:22:43 server2 slapd[12894]: slap_queue_csn: queing 0x194e31f0 20090203172243.314729Z#000000#002#000000
Feb 3 18:22:43 server2 slapd[12894]: slap_graduate_commit_csn: removing 0x194e3930 20090203172243.314729Z#000000#002#000000
Feb 3 18:22:43 server2 slapd[12894]: slapd starting
Feb 3 18:22:43 server2 slapd[12894]: syncrepl_entry: rid=001 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Feb 3 18:22:43 server2 slapd[12894]: syncrepl_entry: rid=001 inserted UUID 31cda500-29ca-4bf8-bc53-253af0021b21
Feb 3 18:22:43 server2 slapd[12894]: syncrepl_entry: rid=001 be_search (0)
Feb 3 18:22:43 server2 slapd[12894]: syncrepl_entry: rid=001 cn=config
Feb 3 18:22:43 server2 slapd[12894]: syncrepl_entry: rid=001 be_add (68)
Feb 3 18:22:43 server2 slapd[12894]: dn_callback : new entry is older than ours cn=config ours 20090203172228.809275Z#000000#000#000000, new 20090203165238.611891Z#000000#001#000000
Feb 3 18:22:43 server2 slapd[12894]: syncrepl_entry: rid=001 entry unchanged, ignored (cn=config)
8<------------
Only four entries are in this case:
- cn=config
- cn=schema,cn=config
- olcDatabase={-1}frontend,cn=config
- olcDatabase={0}config,cn=config
How can we force syncrepl to overwrite these entries?
Any hints or advice would be most appreciated.
Regards, Jonathan
--On Tuesday, February 03, 2009 6:32 PM +0100 Jonathan Clarke jclarke@linagora.com wrote:
> Hi all,
>
> We have set up a couple of servers in an N-way multimaster config, using back-config, as explained in the admin guide. These all use RE24, checked out today.
>
> We are now trying to add another server to the existing cluster. To do this, we want to replicate the existing cn=config branch from the cluster to initialize the config for the new server.
>
> To do this, we start the new server with a minimal cn=config branch, making it a syncrepl consumer to an existing server (consumer only, no multimaster on this new server):
How do you expect to replicate the cn=config branch from a multi-master and end up with only a replica? I'm lost. Once it finishes, it'll be a multi-master, not a pure replica.
If you already have a syncrepl replica, and are just wanting to set up a new one, you should slapcat the config tree from the existing replica and slapadd it to the new one before starting it. This will avoid all these problems. Otherwise, you need to come up with a new cn=config tree that is *not* replicated from one of the masters, which again avoids the issues you are seeing.
I.e., for new servers, either slapcat the config tree from one that matches the template you are creating and slapadd that to the new server, or come up with an entirely new config tree.
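The slapcat/slapadd seeding might look like the following sketch (the paths, config directory, and database number are illustrative assumptions for a default 2.4 install):

```shell
# On an existing master: dump the config database (-n 0) to LDIF.
# slapcat includes operational attributes such as entryCSN and
# entryUUID in its output, so the CSNs are preserved.
slapcat -n 0 -F /usr/local/etc/openldap/slapd.d -l config.ldif

# On the new server: load that LDIF into an empty config directory
# *before* the first slapd start.
mkdir -p /usr/local/etc/openldap/slapd.d
slapadd -n 0 -F /usr/local/etc/openldap/slapd.d -l config.ldif

# Then start slapd normally; its entries carry the original CSNs,
# so syncrepl has no newer local entries to conflict with.
```

Since the entries were never modified locally, the new server joins the cluster with timestamps matching the master it was seeded from.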
If you really want to have fun, set up another database on the master to store the cn=config tree for replicas under a different branch, and then use slapo-rwm to rewrite it as a config tree for any replica that connects to it. This should work in theory, although I've never done it.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
> --On Tuesday, February 03, 2009 6:32 PM +0100 Jonathan Clarke jclarke@linagora.com wrote:
>
>> Hi all,
>>
>> We have set up a couple of servers in an N-way multimaster config, using back-config, as explained in the admin guide. These all use RE24, checked out today.
>>
>> We are now trying to add another server to the existing cluster. To do this, we want to replicate the existing cn=config branch from the cluster to initialize the config for the new server.
>>
>> To do this, we start the new server with a minimal cn=config branch, making it a syncrepl consumer to an existing server (consumer only, no multimaster on this new server):
>
> How do you expect to replicate the cn=config branch from a multi-master and end up with only a replica? I'm lost. Once it finishes, it'll be a multi-master, not a pure replica.
Absolutely - this is the aim: to integrate a new master server into the existing multi-master cluster. Sorry if I was not clear; our aim is not to set up a simple replica, but an extra full-blown master (the N+1th of an N-way multimaster setup).
We are basically attempting to industrialize adding a server to a cluster of existing multi-masters in a load-balancing setup. The idea is to save as much manual copying as possible, and enjoy the "magic" of multi-master replication coupled with cn=config :))
> If you really want to have fun, set up another database on the master to store the cn=config tree for replicas under a different branch, and then use slapo-rwm to rewrite it as a config tree for any replica that connects to it. This should work in theory, although I've never done it.
In theory too (I have not tested since I'm away from the test machines right now), this would produce the same symptom as I described originally - the entries from the pseudo-cn=config branch on the master would be detected as older than the local, new cn=config entries, and be rejected.
My question could be put more broadly: how can you tell syncrepl that it is really *just* a slave, and that it should replace everything it has with content from the master, even if one of its own entries is more recent according to the CSN? The current behavior is to keep the most recent modification, thus compromising the replica's integrity.
Typically, this is the case on a newly installed server with a freshly slapadd-ed cn=config - its CSNs will have a timestamp more recent than the other masters' config.
Thanks for your reply Quanah.
Regards, Jonathan
--On Tuesday, February 03, 2009 10:37 PM +0100 Jonathan Clarke jclarke@linagora.com wrote:
> Absolutely - this is the aim, to integrate a new master server into the existing multi-master cluster. Sorry if I was not clear, our aim is not to set up a simple replica, but an extra full-blown master (N+1 of an N-way multimaster setup).
Then you need to do the steps I outlined: slapcat one of your master's cn=config trees, and slapadd that onto the new master and start it. Then all the entries will have the right timestamps, and everything will replicate as appropriate.
> We are basically attempting to industrialize adding a server to a cluster of existing multi-masters in a load-balancing setup. The idea is to save as much manual copying as possible, and enjoy the "magic" of multi-master replication coupled with cn=config :))
Use the above to correctly seed the new master's cn=config tree.
>> If you really want to have fun, set up another database on the master to store the cn=config tree for replicas under a different branch, and then use slapo-rwm to rewrite it as a config tree for any replica that connects to it. This should work in theory, although I've never done it.
>
> In theory too (I have not tested since I'm away from the test machines right now), this would produce the same symptom as I described originally - the entries from the pseudo-cn=config branch on the master would be detected as older than the local, new cn=config entries, and be rejected.
No, it wouldn't. You're missing the overall point of what I'm talking about here. In this setup, you would have two config trees: one for the various masters, rooted at "cn=config" on the master servers, and one on the master servers (rooted at, say, "cn=config-replica") for the replicas' cn=config tree. Then you would use slapo-rwm for the replicas to rewrite calls to the masters for cn=config to actually read from the cn=config-replica tree. Again, you would still want to do the initial seed on these replicas via slapcat/slapadd. That will always be your basic first step, and will fully avoid the issue you are hitting. This way, you can have multiple configuration trees for different types of servers.
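A slapd.conf-style sketch of that layout (untested and purely illustrative, as Quanah says - the suffixes, directory, and the back-relay arrangement used to expose the tree under a rewritten suffix are all assumptions):

```
# Real database on the master holding the replicas' config tree
database    hdb
suffix      "cn=config-replica"
directory   /usr/local/var/openldap-config-replica

# Virtual view of that tree under a suffix the replicas can search;
# slapo-rwm massages DNs between the virtual and real suffixes
database    relay
suffix      "cn=config-for-replicas"
relay       "cn=config-replica"
overlay     rwm
rwm-suffixmassage "cn=config-replica"
```

Each replica's syncrepl searchbase would then point at the virtual suffix rather than at the master's own cn=config.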
> Typically, this is the case on a newly installed server, with a freshly slapadd-ed cn=config - CSNs will have a timestamp more recent than the other masters' config.
No, the CSNs when you use a slapadded cn=config will *not* be more recent. They'll be preserved to be what they were when the slapcat was done. That's why it avoids this problem.
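That is, the slapcat output carries the operational attributes inline, and slapadd reloads them verbatim - for example (entry trimmed; the UUID and CSN values are taken from the logs earlier in this thread):

```
dn: cn=config
objectClass: olcGlobal
cn: config
entryUUID: 31cda500-29ca-4bf8-bc53-253af0021b21
entryCSN: 20090203165238.611891Z#000000#001#000000
createTimestamp: 20090203165238Z
modifyTimestamp: 20090203165238Z
```

After slapadd, the new server's cn=config entry has the same entryCSN as the master's, so the syncrepl dn_callback comparison never finds a newer local entry.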
> Thanks for your reply Quanah.
No problem! :)
--Quanah
Hi,
> My question could be put more broadly: how can you tell syncrepl that it is really *just* a slave, and that it should replace everything it has with content from the master, even if one of its own entries is more recent according to the CSN? The current behavior is to keep the most recent modification, thus compromising the replica's integrity.
Well, as Jonathan said, with a minimal slapadd (just a few entries such as cn=config; cn=schema,cn=config; olcDatabase={0}config,cn=config and olcDatabase={-1}frontend,cn=config), the entire cn=config tree will be replicated except those four entries, because of the CSN. The idea is to add a fictitious CSN in the slapadd LDIF:
8<------------
dn: cn=config
objectClass: olcGlobal
cn: config
olcServerID: 2
entryCSN: 20000101000000.000000Z#000000#001#000000
createTimestamp: 20000101000000Z
modifyTimestamp: 20000101000000Z

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootPW: secret
olcSyncRepl: rid=001 provider=ldap://server1/ binddn="cn=config"
  bindmethod=simple credentials=secret searchbase="cn=config"
  type=refreshAndPersist retry="5 5 300 5" timeout=3
entryCSN: 20000101000000.000000Z#000000#001#000000
createTimestamp: 20000101000000Z
modifyTimestamp: 20000101000000Z

[...]
8<------------
In this case, the entire cn=config branch is replicated, because its CSNs are older than the provider's (server1). The replica then becomes a provider itself, because it is referenced by an olcSyncRepl attribute on the primary provider (server1).
All works fine.
But there are still problems. I saw one when the replica started: when it attempts to replicate olcDatabase={0}config,cn=config, slapd is forced to stop.
8<--------
Config: ** successfully added syncrepl "ldap://192.168.101.12/"
ldif_read_file: read entry file: "/usr/local/openldap-2.4/etc/openldap/slapd.d/cn=config/olcDatabase={0}config.ldif"
=> str2entry: "dn: olcDatabase={0}config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootDN: cn=config
olcRootPW:: c2VjcmV0
olcSyncrepl: {0}rid=001 provider=ldap://192.168.101.11/ binddn="cn=config" bindmethod=simple credentials=linagora searchbase="cn=config" type=refreshAndPersist retry="5 5 300 5" timeout=3
entryCSN: 20000101000000.000000Z#000000#001#000000
createTimestamp: 20000101000000Z
modifyTimestamp: 20000101000000Z
structuralObjectClass: olcDatabaseConfig
entryUUID: d7843fc4-93b8-433d-85d2-cc398eb3ee2a
creatorsName: cn=config
modifiersName: cn=config
"
dnPrettyNormal: <olcDatabase={0}config>
=> ldap_bv2dn(olcDatabase={0}config,0)
<= ldap_bv2dn(olcDatabase={0}config)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(olcDatabase={0}config)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(olcDatabase={0}config)=0
<<< dnPrettyNormal: <olcDatabase={0}config>, <olcDatabase={0}config>
dnNormalize: <cn=config>
=> ldap_bv2dn(cn=config,0)
<= ldap_bv2dn(cn=config)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(cn=config)=0
<<< dnNormalize: <cn=config>
dnNormalize: <cn=config>
=> ldap_bv2dn(cn=config,0)
<= ldap_bv2dn(cn=config)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(cn=config)=0
<<< dnNormalize: <cn=config>
dnNormalize: <cn=config>
=> ldap_bv2dn(cn=config,0)
<= ldap_bv2dn(cn=config)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(cn=config)=0
<<< dnNormalize: <cn=config>
<= str2entry(olcDatabase={0}config) -> 0x2886a38
<= acl_access_allowed: granted to database root
ldif_write_entry: wrote entry "olcDatabase={0}config,cn=config"
send_ldap_result: conn=-1 op=0 p=0
send_ldap_result: err=0 matched="" text=""
send_ldap_result: conn=-1 op=0 p=0
send_ldap_result: err=0 matched="" text=""
ldap_msgfree
slapd: result.c:112: ldap_result: Assertion `ld != ((void *)0)' failed.
Abandon
8<--------
After restarting it, it seems that all data has been replicated successfully, and OpenLDAP no longer stops.
Any idea? A bug?
Cheers, Thomas.
openldap-software@openldap.org