Re: (ITS#6138) Bad Cancel/Abandon/"internal abandon"/Syncprov interactions - openldap-bugs

2 Jun 2009


      back-ldap:extended.c also does "suppress response, it has been sent",
but does it by returning and setting rs->sr_err = SLAPD_ABANDON.  Might
break assumptions somewhere that SLAPD_ABANDON implies o_abandon was
set.  And I guess the hack fails if the operation gets cancelled.
========================================================================
I think these are the Operation states related to Cancel and Abandon:
op->o_abandon is set for these - could extend to multiple values:
A) Operation Abandoned/Cancelled by client.
B) Operation implicitly abandoned by client. (Bind or lost connection)
C) Operation abandoned by server.  (It wants to close the connection)
D) Suppress response - a duplicate of the operation will proceed. (syncprov)
E) Suppress response - final send_ldap_response() was done. (retcode overlay)
rs->sr_err == SLAPD_ABANDON if:
F) The backend obeyed o_abandon.  (Cancel op, if any, will succeed)
G=E) Suppress response - final send_ldap_response() was done. (back-ldap)
op->o_cancel packs these states/values:
H) The o_abandon is due to a Cancel.
I) Cancel operation wants a result, cancelled op must set it and wait.
J) Result is available to the Cancel operation.
K) Result. (LDAP result code, or SLAP_CANCEL_ACK for success)
L) Cancel operation has fetched result, cancelled operation can proceed.
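To illustrate how states H-L could share one integer, here is a rough
sketch.  The layout and all SK_/sk_ names are hypothetical and made up for
this example; this is not OpenLDAP's actual o_cancel encoding.

```c
/* Hypothetical layout: low bits hold the handshake state (H, I, J, L),
 * the remaining bits hold the result (K).  Not the real o_cancel format. */
enum sk_cancel_state {
    SK_CANCEL_NONE = 0, /* no Cancel is involved */
    SK_CANCEL_REQ  = 1, /* H+I: abandon is due to a Cancel that wants a result */
    SK_CANCEL_RES  = 2, /* J: result is available to the Cancel operation */
    SK_CANCEL_DONE = 3  /* L: result fetched, cancelled operation can proceed */
};

#define SK_STATE_BITS 2u
#define SK_STATE_MASK ((1u << SK_STATE_BITS) - 1u)

/* Pack a handshake state and a result code (K) into one value. */
static unsigned sk_cancel_pack(enum sk_cancel_state st, unsigned result)
{
    return (result << SK_STATE_BITS) | (unsigned)st;
}

/* Extract the handshake state from a packed value. */
static enum sk_cancel_state sk_cancel_state_of(unsigned packed)
{
    return (enum sk_cancel_state)(packed & SK_STATE_MASK);
}

/* Extract the result code from a packed value. */
static unsigned sk_cancel_result_of(unsigned packed)
{
    return packed >> SK_STATE_BITS;
}
```

With state and result in separate bit ranges, the Cancel operation can read
the result without disturbing the handshake state.  Real code would also
need locking or atomic access, which this sketch leaves out.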
States that fit in none of the above, or poorly so:
M) Operation must not be waited for, e.g. by Cancel.
   Operation is itself waiting for others, e.g. cn=config update.
N) Operation invisible to Abandon/Cancel/internal abandon.
   msgID reusable due to result sent to client.  Also case D (syncprov)?
   Fix by removing the op from op->o_conn->c_ops?  Or does that just
   move the problem around?  Would need to do something to o_conn to
   prevent connection_close() from doing connection_destroy().
O) Operation result has been committed, do not abandon.  ITS#6059.
   But o_abandon can be set while trying to commit, unless this flag is
   set before trying - in which case we can't abandon an operation which
   is failing to commit, which may be when it's most relevant.
   Could reset o_abandon, if anyone can keep straight the consequences.
   Or replace the 'if ( op->o_abandon )' tests with some macro call.
   Still, interactions with other states could be a problem.
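The macro idea in case O might look roughly like the following.  The struct
is a stand-in for the real Operation, and both the o_committed flag and the
macro name are hypothetical, invented for this sketch.

```c
/* Stand-in for the relevant Operation fields; not the real struct. */
struct sk_op {
    int o_abandon;   /* nonzero once an abandon/cancel was requested */
    int o_committed; /* hypothetical flag for case O: result committed */
};

/* One place to decide whether code should act on o_abandon.
 * Callers would use this instead of testing op->o_abandon directly. */
#define SK_OP_ABANDONED(op) ((op)->o_abandon && !(op)->o_committed)
```

This only centralizes the test.  It does not settle when o_committed should
be set, so the concern about an operation that is failing to commit still
applies.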
About the o_abandon values above:
B can be treated like A, I think.
C differs in that Cancel/Abandon(operation) should not say "already
  abandoned" since the client doesn't know about the abandon.
  Could be solved with a vague error message.
D-E differ in that o_abandon gets set even though the backends'
  cancel/abandon handlers were not called.  Unsure of the effects of that.
D Syncprov duplicating a Persistent Search operation.
  Handled similar to a server-initiated abandon?  Except if the
  operation cannot be "invisible to Abandon/Cancel" above it must remain
  possible to Abandon/Cancel it.
E Suppress response - response has been sent:
  Set when exiting slap_send_ldap_result() & co?
  Handled similar to a server-initiated abandon?
  At the time slap_send_ldap_result() is called again, the operation
  may have set up things which need to be cleaned up in the normal way.
  Yet it has already gone through that function once, doing callbacks
  etc.  Must "final response" code be prepared to be called twice?
Beyond that, the main problem would be code which transitions from one
state to another: it needs to handle all the other cases.
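A rough sketch of o_abandon extended to multiple values, with one transition
spelled out for every case, follows.  The enum, the function names and the
outcome chosen for each case are assumptions made for illustration; the
questions raised above about D and E remain open.

```c
/* Hypothetical multi-valued o_abandon covering cases A-E above. */
enum sk_abandon {
    SK_AB_NONE = 0,  /* operation is running normally */
    SK_AB_CLIENT,    /* A, B: abandoned/cancelled by the client */
    SK_AB_SERVER,    /* C: abandoned by the server */
    SK_AB_DUPLICATE, /* D: suppress response, a duplicate will proceed */
    SK_AB_SENT       /* E: suppress response, final response was sent */
};

/* State after a client Abandon/Cancel arrives.  Every case is listed
 * so that a newly added state cannot be silently overlooked. */
static enum sk_abandon sk_on_client_abandon(enum sk_abandon cur)
{
    switch (cur) {
    case SK_AB_NONE:      return SK_AB_CLIENT;
    case SK_AB_CLIENT:    return SK_AB_CLIENT; /* already abandoned */
    case SK_AB_SERVER:    return SK_AB_SERVER; /* assumed: server abandon wins */
    case SK_AB_DUPLICATE: return SK_AB_CLIENT; /* assumed: D stays abandonable */
    case SK_AB_SENT:      return SK_AB_SENT;   /* assumed: nothing left to abandon */
    }
    return cur; /* not reached for valid enum values */
}

/* Only for A/B does the client know about the abandon, so only then
 * would an "already abandoned" reply make sense to it (see C above). */
static int sk_client_knows_abandoned(enum sk_abandon cur)
{
    return cur == SK_AB_CLIENT;
}
```

Writing the switch without a default label lets a compiler that warns about
unhandled enum values point out a missing case when a new state is added.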
-- 
Hallvard