New subject: mmr pair stops replicating: "consumer state is newer than provider"

29 Jun 2017


      --On Thursday, June 29, 2017 1:41 PM -0400 btb btb@bitrate.net wrote:
...
On 6/29/17 11:15 AM, Quanah Gibson-Mount wrote:
...
--On Thursday, June 29, 2017 2:12 AM -0400 btb btb@bitrate.net wrote:
...
i see, thanks.  i tested this, and did a modify on each, but didn't see
replication resume.  emulating the syncrepl connection with a manual
search against each master, there do seem to be accesslog entries now,
on both masters:
You may have to restart the consumers (I did when I ran into this).
i did try a restart on both, but they returned to the same state
...
Also, there are 2 sets of CSNs per master that you need to examine --
The CSNs in your database root (i.e., dc=example,dc=org) and your
accesslog root.
that would be these, right?
dsa1 cn=accesslog:
20161019002438.652359Z#000000#000#000000
20170521175113.974560Z#000000#002#000000
20170530214415.204052Z#000000#001#000000
dsa1 dc=example,dc=org:
20170520031415.276678Z#000000#000#000000
20170530214231.171959Z#000000#002#000000
20170530214415.204052Z#000000#001#000000
dsa2 cn=accesslog:
20170520031415.276678Z#000000#000#000000
20170521175113.974560Z#000000#002#000000
20170628034119.327974Z#000000#001#000000
dsa2 dc=example,dc=org:
20170520031415.276678Z#000000#000#000000
20170619014933.531051Z#000000#002#000000
20170628034119.327974Z#000000#001#000000
why are there three per db, and which is supposed to match which?
wow, that's a mess.
So #000# is serverID 0, which would be for any entries prior to moving to 
MMR.  The fact that you have different values for #000# on dsa1 accesslog 
vs the other 3 databases is disturbing.
It would appear DSA1 is serverID 1, and its CSNs make sense:
20170530214415.204052Z#000000#001#000000
20170530214415.204052Z#000000#001#000000
However, there's something seriously wrong with dsa2 (assuming it is serverID 
2):
20170521175113.974560Z#000000#002#000000
20170619014933.531051Z#000000#002#000000
This implies the primary DB received a write on 2017/06/19 @ 01:49:33, 
but the accesslog has not recorded this change: it says the last write op 
to the accesslog DB on #002# was 2017/05/21 @ 17:51:13, nearly a month 
earlier.  So it doesn't seem to think you've done a write op directly 
against serverID 002.
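
The comparison above can be sketched in Python. This is only an illustration, not an OpenLDAP tool: it assumes each CSN has the layout timestamp#count#serverID#mod, with the last three fields in hexadecimal, and the names parse_csn, latest_by_sid and mismatched_sids are hypothetical helpers. The CSN values are the dsa2 ones quoted in this thread.

```python
from datetime import datetime, timezone
from typing import Iterable, NamedTuple


class CSN(NamedTuple):
    timestamp: datetime
    count: int
    sid: int
    mod: int


def parse_csn(text: str) -> CSN:
    """Split a contextCSN string into its four '#'-separated fields."""
    stamp, count, sid, mod = text.strip().split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # Assumption: count, serverID and mod are written in hexadecimal.
    return CSN(when, int(count, 16), int(sid, 16), int(mod, 16))


def latest_by_sid(csns: Iterable[str]) -> dict:
    """Keep the newest CSN seen for each serverID."""
    latest = {}
    for csn in map(parse_csn, csns):
        if csn.sid not in latest or csn > latest[csn.sid]:
            latest[csn.sid] = csn
    return latest


def mismatched_sids(main_db: Iterable[str], accesslog: Iterable[str]) -> list:
    """Return the serverIDs whose newest CSN differs between the two databases."""
    main, log = latest_by_sid(main_db), latest_by_sid(accesslog)
    return sorted(s for s in main.keys() | log.keys() if main.get(s) != log.get(s))


# contextCSN values for dsa2, copied from the message above.
DSA2_MAIN = [
    "20170520031415.276678Z#000000#000#000000",
    "20170619014933.531051Z#000000#002#000000",
    "20170628034119.327974Z#000000#001#000000",
]
DSA2_ACCESSLOG = [
    "20170520031415.276678Z#000000#000#000000",
    "20170521175113.974560Z#000000#002#000000",
    "20170628034119.327974Z#000000#001#000000",
]

print(mismatched_sids(DSA2_MAIN, DSA2_ACCESSLOG))  # [2]
```

Run against the dsa1 values instead, the same check flags serverIDs 0 and 2, which matches the differing #000# value noted for the dsa1 accesslog.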
--Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com
