Hi,
We have a multi-master (2-node) OpenLDAP cluster running 2.4.23 on CentOS 6.
We're effectively using the two nodes as an active-standby failover pair.
The 'Master' node failed last night and we failed over to the standby
(they're behind a load balancer). I am now trying to bring the old 'Master'
back online but it has become apparent there was a misconfiguration in the
server id config.
We had 'Master' = serverid 1 and 'slave' = serverid 2, i.e. the serverid
entries were missing the server's URI. I have now fixed this, but we have
around 500 objects on the old master reporting "changed by peer, ignored" in
the sync log.
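
For reference, the corrected entries now look roughly like the fragment below. The hostnames are placeholders rather than our real ones, and I'm showing slapd.conf syntax as an illustration (under cn=config the equivalent attribute would be olcServerID). Both lines are present on both nodes.

```
# Before (no URI): "serverID 1" on the master, "serverID 2" on the slave
# After (placeholder hostnames):
serverID 1 ldap://ldap1.example.com
serverID 2 ldap://ldap2.example.com
```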
The old master catches up to the latest CSN after I restart it, but then
gets stuck with these "changed by peer, ignored" errors.
My question is, how do I get past this? Is it possible to clear the conflict
on these objects without deleting them entirely, and if so, how?
Or do I need to rebuild the 'old' master server database? If so, is the
process to stop slapd, remove the contents of the database and accesslog
directories, create an LDIF export on the live server, slapadd that file
back onto the 'old' master, start it and then allow it to replicate any
new changes from its partner?
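
To make that concrete, this is roughly the rebuild I have in mind for the old master, written as a dry-run sketch that only prints the commands unless DRY_RUN=0 is set. The suffix, both directories and the LDIF path are placeholders, not my real values. I'm also assuming a bdb/hdb backend (hence keeping DB_CONFIG) and that the export has already been copied over from the live node.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from my real config.
SUFFIX="dc=example,dc=com"          # hypothetical suffix
DB_DIR="/var/lib/ldap"              # hypothetical main database directory
LOG_DIR="/var/lib/ldap-accesslog"   # hypothetical accesslog database directory
LDIF="/tmp/export.ldif"             # made on the live node with: slapcat -b <suffix> -l export.ldif
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

rebuild_node() {
    run service slapd stop || return 1
    # Remove the database files but keep DB_CONFIG (assumes bdb/hdb).
    for dir in "$DB_DIR" "$LOG_DIR"; do
        run find "$dir" -maxdepth 1 -type f ! -name DB_CONFIG -delete || return 1
    done
    # Load the export taken from the live node into the empty database.
    run slapadd -b "$SUFFIX" -l "$LDIF" || return 1
    # slapadd run as root leaves root-owned files; hand them back to the ldap user.
    run chown -R ldap:ldap "$DB_DIR" "$LOG_DIR" || return 1
    run service slapd start
}

rebuild_node
```

The idea is that once slapd starts again, the old master should only need to pull whatever changed on its partner after the export was taken.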
If this is the only way to do it, is there anything I need to look out for?
If not this, then what do I do? I've looked but can't find any guidelines
on how to recover a failed node.
Any help appreciated!
Thanks,
Rich