hyc@symas.com wrote:
rein@OpenLDAP.org wrote:
I had a couple of segfaults when resyncing my servers after upgrading to the upcoming 2.4.16 release. It looks as if a copy of the backend must be used when testing the filter in syncprov_matchops. See the gdb output at the end. Note that some function names are incorrect due to optimization. A fix is coming.
The fix makes no sense, or the problem has not yet been analyzed sufficiently. Nobody in that call chain should be zeroing out bd_info. And if someone *is*, then it will happen in *whatever* BackendDB structure is currently being used.
Explain the real cause of the problem, and why the fix is correct.
The problem is not zeroing of bd_info; it is that the entire op2.o_bd points to garbage, as the gdb output shows. I did forget to print ss->s_op->o_bd, though. op2 is a copy of *ss->s_op, but op2.o_bd and ss->s_op->o_bd differ. The content of *ss->s_op->o_bd looks reasonable.
The copying of *ss->s_op into op2 was introduced in rev 1.233 as a fix to ITS#5486. It doesn't say why this was the correct fix, but I assume it was done because something could modify *ss->s_op while the filter was being tested. Btw, the gdb output from ITS#5486 shows a db with similar garbage, so I suspect that these ITSes are related.
If something can mess with ss->s_op, it might just as well mess with ss->s_op->o_bd. The copying of op->o_bd that takes place all over the code is a nightmare! I have no clue as to what modified *ss->s_op and/or *ss->s_op->o_bd, and I'm not very satisfied with the fact that something did. Finding out why this happened may be the correct fix.
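
To illustrate how op2.o_bd and ss->s_op->o_bd can end up different even though op2 started as a plain copy, here is a minimal standalone sketch. It is not slapd code: DemoBackend, DemoOp and both functions are hypothetical stand-ins, and the scenario (someone temporarily pointing o_bd at a short-lived copy and restoring it later) is only an assumption about what might be happening, not a diagnosis.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only, NOT the real slapd types. */
typedef struct DemoBackend {
    const char *bd_name;
} DemoBackend;

typedef struct DemoOp {
    DemoBackend *o_bd;  /* pointer member: struct copy duplicates the pointer, not the target */
    int o_tag;
} DemoOp;

/* Returns 1 if the copy's o_bd still matches the original's after the
 * original has been pointed back at the long-lived backend, else 0. */
int copy_tracks_original(void)
{
    DemoBackend real = { "real-db" };
    DemoBackend tmp = { "temporary-copy" };  /* assumed short-lived backend copy */
    DemoOp orig = { &tmp, 1 };               /* o_bd temporarily redirected */
    DemoOp op2 = orig;                       /* shallow struct copy taken right now */

    assert(op2.o_bd == orig.o_bd);           /* identical at the moment of the copy */

    orig.o_bd = &real;                       /* original is restored afterwards */

    /* op2.o_bd still points at tmp. Once tmp's lifetime ends, that pointer
     * would dangle, so it is only compared here and never dereferenced. */
    return op2.o_bd == orig.o_bd;
}

/* Returns 1 if a change made through the original's o_bd is visible
 * through the copy's o_bd, i.e. both share one backend structure. */
int shared_target_visible(void)
{
    DemoBackend db = { "real-db" };
    DemoOp orig = { &db, 1 };
    DemoOp op2 = orig;                       /* both o_bd members point at db */

    orig.o_bd->bd_name = "changed";
    return strcmp(op2.o_bd->bd_name, "changed") == 0;
}
```

Calling copy_tracks_original() gives 0 and shared_target_visible() gives 1. So copying the op protects against later changes to the pointer members of *ss->s_op, but not against changes to whatever o_bd pointed to when the copy was made, and it freezes whatever o_bd happened to hold at that instant.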
Rein