Full_Name: Hallvard B Furuseth
Version: HEAD
OS: Linux
URL:
Submission from: (NULL) (129.240.6.233)
Submitted by: hallvard

slapd/cancel.c sets o_abandon before o_cancel. Thus the operation being cancelled can act on o_abandon before o_cancel gets set, though I had to insert some sleeps to trigger that. There are two outcomes. Either the operation is abandoned and the Cancel operation receives tooLate, or, if the client unbinds/closes the connection fast enough, Cancel hangs: slapd does not close the connection, and then hangs on shutdown with "slapd shutdown: waiting for 1 operations/tasks to finish".
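
To make the window concrete, here is a minimal single-threaded sketch that replays the two stores with the target's check placed either between them or after both. It is not slapd code: struct race_op, the RACE_* constants, race_check() and race_replay() are hypothetical stand-ins, and only the `o_abandon && !o_cancel` test is modelled on the condition in send_ldap_response().

```c
/* Illustrative stand-ins only; the real Operation struct and the
 * SLAP_CANCEL_* values live in slapd and are not reproduced here. */
enum { RACE_CANCEL_NONE = 0, RACE_CANCEL_REQ = 1 };
enum { RACE_SENT = 0, RACE_ABANDONED = 1, RACE_CANCELLED = 2 };

struct race_op {
    int o_abandon;
    int o_cancel;
};

/* Modelled on the shape of the test in send_ldap_response():
 * abandon set and cancel not set means the operation is dropped
 * silently. The other two outcomes are assumptions for the demo. */
static int race_check(const struct race_op *op)
{
    if (op->o_abandon && !op->o_cancel)
        return RACE_ABANDONED;
    if (op->o_cancel)
        return RACE_CANCELLED;
    return RACE_SENT;
}

/* Replays the Cancel thread's two stores in their current order.
 * If check_between_stores is nonzero, the target's check runs in
 * the window between them; otherwise it runs after both. */
static int race_replay(int check_between_stores)
{
    struct race_op op = { 0, RACE_CANCEL_NONE };
    int seen = RACE_SENT;

    op.o_abandon = 1;                  /* first store */
    if (check_between_stores)
        seen = race_check(&op);        /* target runs in the window */
    op.o_cancel = RACE_CANCEL_REQ;     /* second store */
    if (!check_between_stores)
        seen = race_check(&op);
    return seen;
}
```

With the check inside the window, race_replay(1) reports the operation as plainly abandoned, so nothing is left to answer the Cancel; with the check after both stores, race_replay(0) reports it as cancelled.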
Since the flags are not mutex-protected (at least not when read), it's not enough to move the o_cancel setting after o_abandon in the Cancel thread. The cancelled thread might still see the o_abandon change first. A fix could be to make o_abandon a bitmask which says whether the abandon is actually a cancel, but the Abandon and Cancel operations will still need a mutex to coordinate so that Abandon does not reset a Cancel bitflag. In any case, it'd be cleaner if an operation which reacts to o_abandon grabs some mutex before checking o_cancel.
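
A rough sketch of the bitmask idea, assuming a per-operation mutex that both writers and the reader take. Everything here is hypothetical (struct flag_op, the FLAG_* bits and the three helpers are not slapd names), and it only illustrates the coordination, not how it would be wired into the Operation struct.

```c
#include <pthread.h>

/* Hypothetical bit values; slapd's o_abandon is not a bitmask today. */
#define FLAG_ABANDON 0x1u   /* stop working on the operation */
#define FLAG_CANCEL  0x2u   /* ...and the stop came from a Cancel */

struct flag_op {
    pthread_mutex_t mutex;  /* guards flags for writers and readers */
    unsigned flags;
};

/* Abandon only ORs in its own bit, so it can never clear FLAG_CANCEL. */
static void flag_request_abandon(struct flag_op *op)
{
    pthread_mutex_lock(&op->mutex);
    op->flags |= FLAG_ABANDON;
    pthread_mutex_unlock(&op->mutex);
}

/* Cancel sets both bits under one lock, so a reader that holds the
 * same lock never sees "abandoned" without "cancelled". */
static void flag_request_cancel(struct flag_op *op)
{
    pthread_mutex_lock(&op->mutex);
    op->flags |= FLAG_ABANDON | FLAG_CANCEL;
    pthread_mutex_unlock(&op->mutex);
}

/* The target operation takes the lock before deciding how to react. */
static unsigned flag_snapshot(struct flag_op *op)
{
    unsigned f;

    pthread_mutex_lock(&op->mutex);
    f = op->flags;
    pthread_mutex_unlock(&op->mutex);
    return f;
}
```

A caller would initialise the struct with `{ PTHREAD_MUTEX_INITIALIZER, 0 }` and test `flag_snapshot(&op) & FLAG_CANCEL` where the code currently reads o_cancel. An Abandon arriving after a Cancel leaves the cancel bit in place.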

The problem was tested as follows:
- sleep 0.2 sec after Statslog "DEL" and before setting SLAP_CANCEL_REQ.
- log "ABANDONED" when send_ldap_response() abandons the operation.
- Client: A python socket client which sends raw BER, no libldap:

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('localhost', 3890))
    s.send(delete("cn=test"))
    time.sleep(0.1)
    s.send(cancel()) # cancel last operation
    #sys.exit()
    time.sleep(0.4)
    s.send(unbind())

-->

    conn=0 fd=9 ACCEPT from IP=127.0.0.1:56945 (IP=127.0.0.1:3890)
    conn=0 op=0 DEL dn="cn=test"
    conn=0 op=1 EXT oid=1.3.6.1.1.8
    conn=0 op=1 CANCEL msg=1
    conn=0 op=0 ABANDONED
    conn=0 op=2 UNBIND
    conn=0 op=1 RESULT oid= err=120 text=
    conn=0 fd=9 closed
    <server closed connection, client exited>
    ^C slapd

If the client exits after send(cancel()):

    conn=0 fd=9 ACCEPT from IP=127.0.0.1:48826 (IP=127.0.0.1:3890)
    conn=0 op=0 DEL dn="cn=test"
    conn=0 op=1 EXT oid=1.3.6.1.1.8
    conn=0 op=1 CANCEL msg=1
    conn=0 op=2 UNBIND
    conn=0 op=0 ABANDONED
    <not closing connection>
    ^C slapd
    daemon: shutdown requested and initiated.
    slapd shutdown: waiting for 1 operations/tasks to finish
    <slapd is hanging>
    kill -KILL <slapd>

slapd.conf:

    include   servers/slapd/schema/core.schema
    allow     update_anon
    database  ldif
    directory "."
    suffix    "cn=test"

Patches to slapd:

Index: cancel.c
--- cancel.c    21 Jan 2009 23:40:25 -0000    1.30
+++ cancel.c    11 May 2009 04:42:58 -0000
@@ -92,4 +92,8 @@
     }
 
+    {
+        struct timeval timeout = { 0, 200000 };
+        select(0, NULL, NULL, NULL, &timeout);
+    }
     o->o_cancel = SLAP_CANCEL_REQ;
Index: delete.c
--- delete.c    21 Jan 2009 23:40:26 -0000    1.144
+++ delete.c    11 May 2009 04:42:58 -0000
@@ -75,4 +75,8 @@
         op->o_log_prefix, op->o_req_dn.bv_val, 0, 0, 0 );
 
+    {
+        struct timeval timeout = { 0, 200000 };
+        select(0, NULL, NULL, NULL, &timeout);
+    }
     if( op->o_req_ndn.bv_len == 0 ) {
         Debug( LDAP_DEBUG_ANY, "%s do_delete: root dse!\n",
Index: result.c
--- result.c    11 May 2009 02:23:51 -0000    1.331
+++ result.c    11 May 2009 04:42:58 -0000
@@ -418,4 +418,6 @@
     if (( rs->sr_err == SLAPD_ABANDON || op->o_abandon ) && !op->o_cancel ) {
         rc = SLAPD_ABANDON;
+        Statslog( LDAP_DEBUG_STATS,
+            "%s ABANDONED\n", op->o_log_prefix, 0, 0, 0, 0 );
         goto clean2;
     }