Hi Quanah,
It certainly sounds very similar from the description. I've applied the patch on our Dev systems and tried to recreate the crash, but this time the servers don't crash and the change appears to make it to all 4 servers.
Based on that I'm happy to close it as a duplicate.
For completeness, here's the log snippet from one of the 2 servers that did not receive the change directly:
2018-08-13T15:51:41.309019+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: conn=2891 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
2018-08-13T15:51:41.310958+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: conn=2891 op=2 UNBIND
2018-08-13T15:51:41.311320+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: conn=2891 fd=17 closed
2018-08-13T15:52:18.347529+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: do_syncrep2: rid=003 cookie=rid=003,sid=00a,csn=20180810152745.105428Z#000000#009#000000
2018-08-13T15:52:18.347925+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: do_syncrep2: rid=003 CSN too old, ignoring 20180810152745.105428Z#000000#009#000000 (reqStart=20180810152801.000001Z,cn=accesslog)
2018-08-13T15:52:19.361612+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: do_syncrep2: rid=004 cookie=rid=004,sid=00c,csn=20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.361936+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: slap_queue_csn: queueing 0x4279380 20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.362187+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: syncrepl_message_to_op: rid=004 tid 6f1f6700
2018-08-13T15:52:19.378677+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: slap_queue_csn: queueing 0x3e067c0 20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.380316+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: slap_graduate_commit_csn: removing 0x3e067c0 20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.380570+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: slap_graduate_commit_csn: removing 0x4279380 20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.380819+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: syncrepl_message_to_op: rid=004 be_modify uid=jaw,ou=people,ou=central,dc=authorise-dev,dc=ed,dc=ac,dc=uk (0)
2018-08-13T15:52:19.381044+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: slap_queue_csn: queueing 0x419d5c0 20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.385145+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: slap_graduate_commit_csn: removing 0x419d5c0 20180813145040.294725Z#000000#00c#000000
2018-08-13T15:52:19.385403+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: do_syncrep2: rid=004 LDAP_RES_SEARCH_RESULT
2018-08-13T15:52:19.385654+01:00 bonsai.authorise-dev.is.ed.ac.uk slapd[33415]: do_syncrep2: rid=004 cookie=rid=004,sid=00c,csn=20120217162731.749366Z#000000#000#000000;20180701051000.119854Z#000000#003#000000;20180711011500.114848Z#000000#004#000000;20180710075708.402194Z#000000#005#000000;20180703112408.905121Z#000000#006#000000;20180813144557.086593Z#000000#009#000000;20180813145034.034818Z#000000#00b#000000;20180813145040.294725Z#000000#00c#000000
Kind regards,
Mark
On 10/08/18 17:00, Quanah Gibson-Mount wrote:
--On Friday, August 10, 2018 4:41 PM +0000 Mark.Cairney@ed.ac.uk wrote:
Full_Name: Mark Cairney
Version: 2.4.46
OS: Centos 7
URL:
Submission from: (NULL) (129.215.149.98)
In an MMR setup of >=3 servers (4 in this case) you can cause a segfault if you issue the same modification to 2 separate servers within your replication interval. It seems that the last server to receive the change falls over when it receives the change from one of the other servers during replication:
Hi Mark,
I'm pretty sure you're reporting ITS#8843. I suggest applying this patch:
--Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com