Rein Tollevik wrote:
hyc@symas.com wrote:
rein@OpenLDAP.org wrote:
I had a couple of segfaults when resyncing my servers after upgrading to the
upcoming 2.4.16 release. Looks as if a copy of the backend must be used when
testing the filter in syncprov_matchops. See the gdb output at the end. Note,
some function names are incorrect due to optimization. A fix is coming.
The fix makes no sense, or the problem has not yet been analyzed sufficiently.
Nobody in that call chain should be zeroing out bd_info. And if someone *is*,
then it will happen in *whatever* BackendDB structure is currently being used.
Explain the real cause of the problem, and why the fix is correct.
The problem is not zeroing of bd_info, it is that the entire op2.o_bd
points to garbage, as the gdb output shows. I did forget to print
ss->s_op->o_bd though. op2 is a copy of *ss->s_op, but op2.o_bd and
ss->s_op->o_bd differ. The content of *ss->s_op->o_bd looks reasonable.
The copying of *ss->s_op into op2 was introduced in rev 1.233 as a fix
to ITS#5486. It doesn't say why this was the correct fix, but I assume
it was done because something could modify *ss->s_op while the filter
was being tested.
Yes of course, particularly op->o_callback.
Btw, the gdb output from ITS#5486 shows a db with
similar garbage, so I suspect that these ITSes are related.
Assuming that something could mess with ss->s_op, it might as well mess
with ss->s_op->o_bd. The copying of the op->o_bd that takes place all
around is a nightmare! I have no clue as to who modified *ss->s_op
and/or *ss->s_op->o_bd, and I'm not very satisfied with the fact that
something did. Finding out why this happened may be the correct fix.
Yes, that's my point.
Down the road we need to fix things so that all this copying is unnecessary;
it just involves adding an op->o_bd_info field so that we no longer need to
change anything in the op->o_bd itself. But in the meantime, we need to find
out why an invalid op->o_bd is there. Most likely some lower function
temporarily put a stack'd copy in there and didn't restore the original value
before returning. And again, if that's the case, then it doesn't matter what
value you set higher up; copy or not, it will still point to garbage.
--
-- Howard Chu
CTO, Symas Corp.
http://www.symas.com
Director, Highland Sun
http://highlandsun.com/hyc/
Chief Architect, OpenLDAP
http://www.openldap.org/project/