Re: mmr pair stops replicating: "consumer state is newer than provider"

5 Jul 2017


      ...
wow, that's a mess.
So #000# is serverID 0, which would be for any entries prior to moving 
to MMR.  The fact that you have different values for #000# on dsa1 
accesslog vs the other 3 databases is disturbing.
It would appear DSA1 is serverID 1, and its CSNs make sense:
20170530214415.204052Z#000000#001#000000
20170530214415.204052Z#000000#001#000000
However, there's something seriously wrong with dsa2 (assuming it is 
serverID 2):
20170521175113.974560Z#000000#002#000000
20170619014933.531051Z#000000#002#000000
This implies the primary DB received a write on 2017/06/19 @ 
01:49:33, but the accesslog has not recorded this change: it says the 
last write op to the accesslog DB on #002# was 2017/05/21 @ 17:51:13, 
nearly a month earlier.  So the accesslog doesn't seem to think you've 
done a write op directly against serverID 002 since then.
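
For anyone comparing CSNs by hand, the reading above can be sketched in a few lines of Python. This is only an illustration, not anything slapd ships: the names `CSN`, `parse_csn` and `accesslog_lag` are made up here, and it assumes the usual `timestamp#count#sid#mod` layout with the last three fields in hexadecimal.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    """One change sequence number, split into its four fields."""
    timestamp: datetime
    count: int
    sid: int
    mod: int


def parse_csn(text: str) -> CSN:
    """Parse 'YYYYmmddHHMMSS.uuuuuuZ#count#sid#mod' (hypothetical helper)."""
    stamp, count, sid, mod = text.strip().split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    when = when.replace(tzinfo=timezone.utc)
    # Assumption: count, sid and mod are hexadecimal strings.
    return CSN(when, int(count, 16), int(sid, 16), int(mod, 16))


def accesslog_lag(primary_csn: str, accesslog_csn: str) -> timedelta:
    """How far the accesslog CSN trails the primary DB CSN for one serverID."""
    primary = parse_csn(primary_csn)
    accesslog = parse_csn(accesslog_csn)
    if primary.sid != accesslog.sid:
        raise ValueError("CSNs belong to different serverIDs")
    return primary.timestamp - accesslog.timestamp
```

Fed the two dsa2 values quoted above (primary first, accesslog second), `accesslog_lag` returns a gap of roughly four weeks, while the identical dsa1 pair gives a zero gap.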
thanks.  i think i've managed to clean up the mess, and replication is 
flowing again.  i've exorcized the old serverid 000 references, and 
verified each server's accesslog is getting updated as local 
modifications occur.
contextcsns seem to be a bit more sane now, hopefully?
...
ldapsearch -ZZxWLLLH 'ldap://dsa1.example.org/' -D
'uid=dit_admin,ou=role_accounts,ou=accounts,dc=example,dc=org' -b 
'cn=config' -s base 'olcserverid'
Enter LDAP Password:
dn: cn=config
olcServerID: 1
...
ldapsearch -ZZxWLLLH 'ldap://dsa2.example.org/' -D
'uid=dit_admin,ou=role_accounts,ou=accounts,dc=example,dc=org' -b 
'cn=config' -s base 'olcserverid'
Enter LDAP Password:
dn: cn=config
olcServerID: 2
...
ldapsearch -ZZxWLLLH 'ldap://dsa1.example.org/' -D
'uid=dit_admin,ou=role_accounts,ou=accounts,dc=example,dc=org' -b 
'dc=example,dc=org' -s base 'contextcsn'
Enter LDAP Password:
dn: dc=example,dc=org
contextCSN: 20170705042207.590054Z#000000#001#000000
contextCSN: 20170704183515.872465Z#000000#002#000000
...
ldapsearch -ZZxWLLLH 'ldap://dsa2.example.org/' -D
'uid=dit_admin,ou=role_accounts,ou=accounts,dc=example,dc=org' -b 
'dc=example,dc=org' -s base 'contextcsn'
Enter LDAP Password:
dn: dc=example,dc=org
contextCSN: 20170705042207.590054Z#000000#001#000000
contextCSN: 20170704183515.872465Z#000000#002#000000
...
ldapsearch -ZZxWLLLH 'ldap://dsa1.example.org/' -D
'uid=dit_admin,ou=role_accounts,ou=accounts,dc=example,dc=org' -b 
'cn=accesslog' -s base 'contextcsn'
Enter LDAP Password:
dn: cn=accesslog
contextCSN: 20170705042145.957972Z#000000#001#000000
contextCSN: 20170704183515.872465Z#000000#002#000000
...
ldapsearch -ZZxWLLLH 'ldap://dsa2.example.org/' -D
'uid=dit_admin,ou=role_accounts,ou=accounts,dc=example,dc=org' -b 
'cn=accesslog' -s base 'contextcsn'
Enter LDAP Password:
dn: cn=accesslog
contextCSN: 20170705042145.957972Z#000000#001#000000
contextCSN: 20170704183515.872465Z#000000#002#000000
i've also increased accesslog data retention from 7 days to 14 days, as 
a bit of a compensation for the infrequent writes, and i'll implement a 
"no-op" cron job as well, as a fail safe.  are there any pitfalls i may 
not be considering with a 14 day accesslog retention period?  is that 
too long according to "typical" consensus?
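
One rough way to think about the retention window is to ask which serverIDs have gone longer without a write than the accesslog keeps its records. The sketch below is only that, under the assumption that this idle-versus-retention comparison is the risk being compensated for; `idle_sids` is a hypothetical helper, and the CSN strings would come from the contextCSN searches shown earlier.

```python
from datetime import datetime, timedelta, timezone


def idle_sids(context_csns, now, retention_days):
    """Return {serverID: idle time} for serverIDs whose newest CSN is
    older than the retention window (hypothetical helper)."""
    window = timedelta(days=retention_days)
    stale = {}
    for csn in context_csns:
        stamp, _count, sid, _mod = csn.strip().split("#")
        last = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
        last = last.replace(tzinfo=timezone.utc)
        idle = now - last
        if idle > window:
            # Assumption: the sid field is hexadecimal.
            stale[int(sid, 16)] = idle
    return stale
```

Called with the two accesslog contextCSN values above, `now` set to the current UTC time and `retention_days` set to 7 or 14, it lists the serverIDs a "no-op" write would need to refresh before the window runs out.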
for posterity's sake, after the mess was cleaned up, once a proper write 
occurred on each master, and the accesslog db was updated and csns 
brought in line, replication began flowing again, without the need for a 
restart on either side [at least in this particular case, anyway].
-ben
