I wrote:
In addition to the issues listed in my last message, this ITS depends on these: [ITS#6763,#6776#6760].
Sorry, that was quite wrong. The changes done so far can stand alone, I've tried hard for that. Taking this ITS _further_ depends on those ITSes. But those ITSes are safest to tackle after the changes I've done so far.
I've done this ITS backwards after some IRC miscommunication and by sitting too close to the code: First commit a huge patch to clean up SlapReply, then explain why. Sorry about that. I intended to get it done for 2.4.24, but when Howard balked at this approach I expect I instead ended up complicating HEAD->RE24 merges.
So to explain better: This is ITS#5340 with a better name. See http://www.openldap.org/its/?findid=5340#followup16 - 18 which actually explains most of the rationale for these changes:
"I'm a bit nervous about doing more than minimal fixes to this until we do something major, because it's so broken. (...) [Howard suggested] code which puts something into a SlapReply should ensure it's taken out again. (...) I think that's not the right fix because it's the most fragile solution: Practically every module which depends on it can break if some other module gets it wrong. OTOH, modules can't just reset SlapReply since some of it is input values to op_*() (...)"
SlapReply is so broken and confusing that for 3 years I've not dared to do the obvious fixes to ITS#5340. Early on I kept prompting others who understood slapd better than me to take on #5430, but nothing happened.
So now I've "done something major": Cleaned up SlapReply enough that it's no longer an act of bravery to take on the original issues. Most of the changes are a preliminary cleanup for going for the original ITS#5340 fixes (obeying the flags). elsewhere:
By the original design, it's not a bug for an op_*() function to expect an incoming SlapReply to be clean, nor to output a "dirty" SlapReply, so it' has got to be wrong to reuse a SlapReply without cleaning it up. So I've taken a simpler and more robust approach this time, adding bigger but more trivial changes: In addition to cleaning up some flags, I've killed most cases of SlapReply reuse, which also isolates what damage one module can do to another through SlapReply. Even if they are third-party overlays where the first is broken because it accurately copied what OpenLDAP's overlays were doing:(
I've also mostly fixed rs.sr_type. That one was almost as broken, somewhat in the spirit of this awful and confusing idea in result.c: /* FIXME: I think rs->sr_type should be explicitly set to * REP_SEARCH here. That's what it was when we entered this * function. send_ldap_error may have changed it, but we * should set it back so that the cleanup functions know * what they're doing. */ No. If something set rs.sr_type = REP_INTERMEDIATE, result.c should not tell its caller that OID rs.sr_rspoid is an entry, rs.sr_entry.
As for user-visible issues the cleanup caught: Half of my December and January ITSes came from this effort. Some minor, some bigger. Generally issues somewhat out of scope of what I was doing in this ITS.
Howard asked if all these changes fixed any crashes. Thinking it over, I've had few crashes because of this mess, and I have had a hard time trying to remember which ones. I've had many crashes in the truckload of asserts I added in my private code, which pointed out these ITSes and how to clean up. I could not have found the ITSes before the cleanup, since the asserts then brought down slapd in an instant.
Also I've broken slapd a number of times by attempting "obvious" fixes before cleaning up, because slapd is more creative than me about reusing SlapReply strangely. Like pcache.c which reused a Search SlapReply with an entry in it for be_modify() and expected the entry to still be there after the modify:-( Looks like a good candidate for trying to construct a pre-cleanup crash, if I ever find time to try: Find a way to feed that SlapReply to something which reuses it for a Search or an Intermediate Response, thus overwriting the entry sitting in the Modify SlapReply.
That kind if thing is why I'm much more eager to get the initial cleanup into RE24 than to get the final entry leak fixes etc into RE24. After the cleanup, there should only be an entry in there if sr_type is REP_SEARCH/REP_SEARCHREF, not also REP_RESULT/REP_GLUE_RESULT as before.
ITSes: ITS#6760 slapo-rwm broken entry handling ITS#6752 slapo-dynlist issues [+ related non-SlapReply issues] ITS#6746, #6763 entry/control leaks [+ related non-SlapReply issues] My commits using rs_flush_entry() & co have fixed some other entry leaks or potential leaks. Some modules - in particular BDB - catch potential leaks of their entries. Others do not, giving us memory leaks. ITS#6731, #6748 used scrambled sr_text on an exited function's stack. ITS#6774, ITS#6785, some of my back-config commits: sr_text/sr_matched sent with sr_err=LDAP_SUCCESS, or related to another error than sr_err.
Browsing 2.4.24 patches, an earlier cleanup would have caught #6633 (slapd-bdb error propagation) - my asserts with test047 catch sr_text != NULL when sr_err == 0 - and likely #6629 (freeing controls).