I have an odd issue with some new slaves that I added to my pool. They ran fine for weeks, sync'ing with the master, until I put them into use (we're using an F5 frontend), whereupon they started getting a contextCSN larger than that of the master.
How is the contextCSN generated? I thought the slave could never get ahead of the master, but it's happening consistently. It's not a transient thing: the slave gets a larger contextCSN and keeps it until the master is updated again.
master: contextCSN: 20121010154339.775633Z#000000#000#000000
slave1: contextCSN: 20121010154442.858054Z#000000#000#000000
slave2: contextCSN: 20121010154351.807575Z#000000#000#000000
As soon as I remove them from use, and update the master, they come back into sync.
Thanks in advance.
SJ
--On Wednesday, October 10, 2012 11:07 AM -0500 Sven Jourgensen svenj@uic.edu wrote:
OpenLDAP version? Configs for master and replicas? Have you verified that the clocks on all 3 servers are in sync?
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
The version is 2.4.23. The config is the same across all the slaves, including the ones that stay in sync under load, and the hardware itself is more than capable. I'm curious about your clock comment, however: would an unstable clock on the slave cause the contextCSN to be out of whack?
That's why I asked how the contextCSN is generated and passed from master to slave, rather than saying "here are all my configs, fix it for me."
SJ
--On Wednesday, October 10, 2012 1:52 PM -0500 Sven Jourgensen svenj@uic.edu wrote:
It is required that the clock be in sync on all servers. This is well documented. You may also wish to read http://www.openldap.org/lists/openldap-technical/201109/msg00047.html
I would also note your version of OpenLDAP is extremely old, and has numerous known issues with sync replication that have since been fixed. See https://www.openldap.org/software/release/changes.html for more details.
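For anyone following along, here is a rough sketch of how the three contextCSN values posted at the start of this thread compare. It assumes the usual CSN layout of timestamp#count#sid#mod with hexadecimal counters; the helper name parse_csn is made up for this example and is not part of OpenLDAP.

```python
from datetime import datetime, timezone

def parse_csn(csn):
    """Split a contextCSN into (timestamp, change count, server ID, mod number).

    Assumes the layout timestamp#count#sid#mod with hex counters.
    """
    stamp, count, sid, mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    return when, int(count, 16), int(sid, 16), int(mod, 16)

# Values copied from the first message in this thread.
csns = {
    "master": "20121010154339.775633Z#000000#000#000000",
    "slave1": "20121010154442.858054Z#000000#000#000000",
    "slave2": "20121010154351.807575Z#000000#000#000000",
}

master_time = parse_csn(csns["master"])[0]

# Offset of each server's contextCSN timestamp from the master's, in seconds.
offsets = {
    name: (parse_csn(value)[0] - master_time).total_seconds()
    for name, value in csns.items()
}

for name, offset in offsets.items():
    print(f"{name}: {offset:+.6f} s relative to master")
```

With these values, slave1 comes out about 63 seconds ahead of the master and slave2 about 12 seconds ahead, while the count, sid, and mod fields are identical on all three. Only the timestamp portion differs.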
--Quanah
I had the exact same problem, although my issues were tied closely to using the memberOf overlay together with delta-sync replication. The current Ubuntu LTS (and Debian Wheezy) ship OpenLDAP 2.4.28, which helped quite a bit. I'm also not using delta-sync replication anymore.
-Yuri
openldap-technical@openldap.org