Full_Name: Paul Smith Version: 2.4.7 OS: Ubuntu 8.04 URL: Submission from: (NULL) (65.78.30.67)
I've been trying to track down a crasher in Evolution's Exchange server backend, and I believe it's a bug in OpenLDAP's libldap. The core manifests as an assert() in io.c:ber_flush2(), but by using valgrind I can see that the real problem is using already-freed memory.
You can see various backtraces and valgrind outputs in the Gnome bug tracking system here: http://bugzilla.gnome.org/show_bug.cgi?id=512605 Note that the later stack traces are better as I've installed debugging versions of OpenLDAP so there are better symbols shown.
In Evolution we have a loop we use to look up global contacts, which looks like this:
for (try = 0; try < 2; try++) { ldap_error = get_gc_connection (gc, op); if (ldap_error != LDAP_SUCCESS) return ldap_error; ldap_error = ldap_search_ext (gc->priv->ldap, base, scope, filter, (char **)attrs, FALSE, NULL, NULL, NULL, 0, &msgid); if (ldap_error == LDAP_SERVER_DOWN) continue; else if (ldap_error != LDAP_SUCCESS) return ldap_error;
ldap_error = gc_ldap_result (gc->priv->ldap, op, msgid, msg); if (ldap_error == LDAP_SERVER_DOWN) continue; else if (ldap_error != LDAP_SUCCESS) return ldap_error;
return LDAP_SUCCESS; }
where get_gc_connection() eventually calls ldap_ntlm_bind() and where gc_ldap_result() calls ldap_result(). When the core happens, the first trip through the for loop here gives an error LDAP_SERVER_DOWN as the return code from ldap_result(). We go back to the top of the loop, and ldap_ntlm_bind() is called. That eventually ends up in ldap_send_initial_request() which invokes ldap_send_server_request() like this:
rc = ldap_send_server_request( ld, ber, msgid, NULL, NULL, NULL, NULL );
Inside ldap_send_server_request() we need to find an LDAPConn* structure, but the one passed in is NULL as is the srvlist, so, we use the default connection:
if ( lc == NULL ) { if ( srvlist == NULL ) { lc = ld->ld_defconn;
Valgrind reports that the very next line, attempting to access lc->lconn_status, is referencing freed memory:
if ( lc != NULL && lc->lconn_status == LDAP_CONNST_CONNECTING ) {
So, basically, we know that ld->ld_defconn is a non-NULL pointer pointing to already freed memory. This seems like it is a buggy state.
How did we get here? In ldap_result() we get into this loop invoking try_read1msg() (I'm removing the mutex locking for readability, even though it IS enabled in my version of libldap):
for ( lc = ld->ld_conns; rc == LDAP_MSG_X_KEEP_LOOKING && lc != NULL; ) { if ( lc->lconn_status == LDAP_CONNST_CONNECTED && ldap_is_read_ready( ld, lc->lconn_sb ) ) { rc = try_read1msg( ld, msgid, all, &lc, result ); if ( lc == NULL ) { /* if lc gets free()'d, * there's no guarantee * lc->lconn_next is still * sane; better restart * (ITS#4405) */ lc = ld->ld_conns;
/* don't get to next conn! */ break; } }
/* next conn */ lc = lc->lconn_next; }
Inside try_read1msg() we call "tag = ber_get_next( lc->lconn_sb, &len, ber );" and if the tag we get back is LBER_DEFAULT, then we declare an error and we free the connection and set the pointer to NULL:
case LBER_DEFAULT: ld->ld_errno = LDAP_SERVER_DOWN; ldap_free_connection( ld, lc, 1, 0 ); lc = *lcp = NULL; return -1;
If you check ldap_free_connection() you'll see that it removes the LDAPConn pointer "lc" from the list of connections before it is freed.
BUT! The ldap_free_connection() function never does anything with the ld->ld_defconn pointer, so if the connection we just freed is the one pointed to by ld->ld_defconn, it is now pointing to freed memory. And that causes the problem detected above by valgrind, or causing an assert later on: accessing freed memory.
I'm not really sure what the right thing to do here is, or I'd provide a patch. Should we set ld_defconn to NULL? Is that ever a valid state? Or should we just pick another connection from the list (and what if there isn't one?)