--On Thursday, September 21, 2017 9:59 PM -0700 "Paul B. Henson" <henson@acm.org> wrote:
It seems there are updates for that group coming from rid 002 (egeria.ldap.cpp.edu) and 003 (minerva.ldap.cpp.edu), but none from rid 001 (themis.ldap.cpp.edu) which is serverid 4, where the change was actually made?
Oh, I thought you had said you only had two masters. This could well be ITS#8444 (ignore the ITS title, it has nothing to do with memberOf), where there are out of sync problems with 3+ MMR nodes and delta-syncrepl when syncprov checkpoints.
--Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com
On Fri, Sep 22, 2017 at 08:50:38AM -0700, Quanah Gibson-Mount wrote:
Oh, I thought you had said you only had two masters. This could well be
Ah, my bad, there are a total of 4 nodes. While technically I guess they could all be "masters", only two of them ever receive writes: one is the primary behind a hardware load balancer and the other is the secondary. So in my head I have two masters and two read-only systems, which I suppose isn't really accurate from an OpenLDAP architecture perspective, sorry.
ITS#8444 (ignore the ITS title, it has nothing to do with memberOf), where there are out of sync problems with 3+ MMR nodes and delta-syncrepl when syncprov checkpoints.
Oh, I remember that ITS; I thought I'd fixed that issue by getting rid of the memberOf overlay and switching to dynlist 8-/. It seems that since I stopped paying attention to it, it's moved on in other directions.
I see there was a proposed patch posted on 8/25 that's been applied to RE24, I'll add that to my system and see if the issue goes away. Am I correct in my assumption that the patch only needs to be applied to the system that is receiving the updates?
Thanks...
On Fri, Sep 22, 2017 at 09:15:56PM -0700, Paul B. Henson wrote:
I see there was a proposed patch posted on 8/25 that's been applied to RE24, I'll add that to my system and see if the issue goes away. Am I correct in my assumption that the patch only needs to be applied to the system that is receiving the updates?
I'd apply it everywhere you have syncprov configured: any unpatched provider could send a cookie with too little information for a replica to spot and skip a duplicate.
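To make the point about cookies concrete, here is a rough Python sketch of how a replica could use the CSNs carried in a sync cookie to decide whether an incoming change is a duplicate. It is a simplified model for illustration, not slapd's actual logic. The CSN values are made up, and the function names (`parse_csn`, `cookie_state`, `is_duplicate`) are hypothetical. It assumes the usual `timestamp#count#sid#mod` CSN layout.

```python
import re

# Assumed CSN layout: YYYYmmddHHMMSS.uuuuuuZ#count#sid#mod (hex fields).
CSN_RE = re.compile(
    r"^(\d{14}\.\d{6}Z)#([0-9a-fA-F]{6})#([0-9a-fA-F]{3})#([0-9a-fA-F]{6})$"
)


def parse_csn(csn):
    """Split a CSN into (serverID, sortable key)."""
    m = CSN_RE.match(csn)
    if m is None:
        raise ValueError(f"not a CSN: {csn!r}")
    timestamp, count, sid, mod = m.groups()
    # Fixed-width timestamps sort correctly as plain strings.
    return int(sid, 16), (timestamp, int(count, 16), int(mod, 16))


def cookie_state(csns):
    """Keep the newest CSN seen for each serverID in a cookie."""
    state = {}
    for csn in csns:
        sid, key = parse_csn(csn)
        if sid not in state or key > state[sid]:
            state[sid] = key
    return state


def is_duplicate(state, csn):
    """True/False when the cookie covers that serverID, None when it does not."""
    sid, key = parse_csn(csn)
    if sid not in state:
        return None  # cookie says nothing about this serverID
    return key <= state[sid]


# Hypothetical values: a change that originated on serverID 4.
change = "20170921215900.000000Z#000000#004#000000"
full = cookie_state([
    "20170921210000.000000Z#000000#002#000000",
    "20170921215900.000000Z#000000#004#000000",
])
partial = cookie_state(["20170921210000.000000Z#000000#002#000000"])

print(is_duplicate(full, change))
print(is_duplicate(partial, change))
```

With the complete cookie the change is recognised as already applied. With the cookie that lacks a CSN for serverID 4, this model cannot tell, which is the kind of gap that would let a duplicate through.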
Thanks to Ondrej for persisting with getting to the bottom of this rather annoying bug :-)
Unfortunately I won't be at LDAPCon this year otherwise I'd buy you a beer!
On Mon, Sep 25, 2017 at 04:31:40PM +0200, Ondřej Kuzník wrote:
I'd apply it everywhere you have syncprov configured: any unpatched provider could send a cookie with too little information for a replica to spot and skip a duplicate.
Hmm, I applied the patch to all four of my servers but I'm still seeing the errors :(...
Oct 2 03:46:24 egeria slapd[86715]: null_callback : error code 0x14
Oct 2 03:46:24 egeria slapd[86715]: syncrepl_message_to_op: rid=002 be_modify uid=nnharpale,ou=user,dc=cpp,dc=edu (20)
Oct 2 03:54:59 egeria slapd[86715]: null_callback : error code 0x14
Oct 2 03:54:59 egeria slapd[86715]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)
Oct 2 03:55:00 egeria slapd[86715]: null_callback : error code 0x14
Oct 2 03:55:00 egeria slapd[86715]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)
Oct 2 03:55:00 egeria slapd[86715]: null_callback : error code 0x14
Oct 2 03:55:00 egeria slapd[86715]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)
Oct 2 03:55:00 egeria slapd[86715]: null_callback : error code 0x14
Oct 2 03:55:00 egeria slapd[86715]: syncrepl_entry: rid=003 be_modify failed (20)
Oct 2 03:55:00 egeria slapd[86715]: do_syncrepl: rid=003 rc 20 retrying (9 retries left)

Oct 2 03:46:14 minerva slapd[68720]: null_callback : error code 0x14
Oct 2 03:46:14 minerva slapd[68720]: syncrepl_message_to_op: rid=002 be_modify uid=nicknguyen,ou=user,dc=cpp,dc=edu (20)
Oct 2 03:55:00 minerva slapd[68720]: null_callback : error code 0x14
Oct 2 03:55:00 minerva slapd[68720]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)
Oct 2 03:55:00 minerva slapd[68720]: null_callback : error code 0x14
Oct 2 03:55:00 minerva slapd[68720]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)
Any other thoughts on what might be going on or what would be helpful to debug it?
Thanks...
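As a debugging aid for log excerpts like the ones above, here is a small Python sketch that tallies failed syncrepl modifies per host, rid, and DN. The error code 0x14 is hex for 20, which is the LDAP result code attributeOrValueExists. The regular expression is an assumption based only on the line format shown in this thread, and the names `LINE_RE`, `RESULT_NAMES`, and `tally` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the syslog lines quoted in this thread (an assumption).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+slapd\[\d+\]:\s+"
    r"syncrepl_message_to_op: rid=(?P<rid>\d+) be_modify (?P<dn>.+) \((?P<rc>\d+)\)$"
)

# Only the code seen in this thread; 0x14 == 20.
RESULT_NAMES = {20: "attributeOrValueExists"}


def tally(lines):
    """Count failed be_modify operations by (host, rid, dn, result code)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m["host"], m["rid"], m["dn"], int(m["rc"]))] += 1
    return counts


# A few lines copied from the egeria log above.
SAMPLE = [
    "Oct 2 03:46:24 egeria slapd[86715]: null_callback : error code 0x14",
    "Oct 2 03:46:24 egeria slapd[86715]: syncrepl_message_to_op: rid=002 be_modify uid=nnharpale,ou=user,dc=cpp,dc=edu (20)",
    "Oct 2 03:54:59 egeria slapd[86715]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)",
    "Oct 2 03:55:00 egeria slapd[86715]: syncrepl_message_to_op: rid=003 be_modify uid=lvl_1_users,ou=group,dc=cpp,dc=edu (20)",
]

for (host, rid, dn, rc), n in sorted(tally(SAMPLE).items()):
    print(f"{host} rid={rid} rc={rc} ({RESULT_NAMES.get(rc, 'unknown')}) x{n} {dn}")
```

Running this over the full logs from every node would show which rid keeps replaying which entry, which can then be compared against the server where the write was originally made.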
openldap-technical@openldap.org