jclarke@linagora.com wrote:
Full_Name: Jonathan Clarke
Version: RE24
OS: CentOS 5.2 x64
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (213.41.243.192)
Hi all,
We have encountered a segfault that occurs in syncrepl.c, on line 1449 (from a checkout of RE24 today).
This occurs during replication of cn=config, by a thread in do_syncrepl on rid=999. Syncrepl pulls in new values for olcSyncRepl on cn=config. This deletes the configuration for rid=999, and adds configuration for rid=001 and rid=002.
Thanks for the report, now fixed in HEAD.
Some debugging (lots) showed the following.
This thread enters syncrepl_config() 3 times:
1) Delete rid=999. This detects this syncrepl is running, and sets si->si_ctype = 0 (as described in the comments in syncrepl.c line 4358). si is not freed, since it's running. The c->be->be_syncinfo->si_cookieState is freed (line 4564).
2) Add rid=001
3) Add rid=002
Back in syncrepl.c on line 1449, the "final delete cleanup" looks for its own si in be->be_syncinfo. It of course doesn't find it (the config is deleted), and segfaults when sip = NULL.
We have tried changing these lines to:

    for ( sip = &be->be_syncinfo; sip != NULL || *sip != si; sip = &(*sip)->si_next );
    if (sip) { *sip = si->si_next; }
However, we then encounter another segfault on line 1448, involving be->be_syncinfo->si_cookieState, which was freed... It's a bit too much for us at this point, do you have any more ideas?
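For reference, here is a minimal sketch of the pointer-to-pointer unlink that the modified loop above appears to be aiming for. It is written against a simplified list rather than the real code: the `node` struct and the `list_unlink` name are hypothetical stand-ins, not syncinfo_t or any actual OpenLDAP function. The point it illustrates is that the end-of-list test has to be on `*pp` (the current element), combined with `&&`, since `pp` itself is never NULL while walking the list. It does not address the freed si_cookieState, and it is not necessarily what the fix in HEAD does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for syncinfo_t; only the link field matters here. */
typedef struct node {
    int rid;
    struct node *next;
} node;

/*
 * Unlink `target` from the list rooted at *head, if it is present.
 * Returns 1 if it was unlinked, 0 if it was not on the list
 * (for example because something else already removed it).
 */
static int list_unlink(node **head, node *target)
{
    node **pp;

    assert(head != NULL);

    /* Stop at the end of the list (*pp == NULL) or at the target. */
    for (pp = head; *pp != NULL && *pp != target; pp = &(*pp)->next)
        ;

    if (*pp == NULL)
        return 0;   /* not found: nothing to unlink */

    *pp = target->next;
    target->next = NULL;
    return 1;
}
```

With a list holding rid 999, 1 and 2, calling `list_unlink(&head, &entry999)` twice would return 1 the first time and 0 the second, instead of walking past the end of the list.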