Full_Name: Jan Safranek Version: 2.4.13 OS: Linux. Fedora 9, x86_64 URL: http://people.redhat.com/jsafrane/openldap-referral-crash/ Submission from: (NULL) (89.102.11.105)
Following ldapsearch crashes when referral chasing is enabled in very unusual environment with many AD servers referring to each other (I had to increase refhoplimit to chase all the referrals).
$ ldapsearch -x -C -D cn=uzivatelk,ou=Services,ou=BOUDA,ou=administrace,dc=broken,dc=net -w 'mojeheslo' -b "dc=broken,dc=net" -h f9 "(sAMAccountName=uzivatel)"
ldapsearch: sockbuf.c:88: ber_sockbuf_ctrl: Assertion `sb != ((void *)0)' failed.
You can find the stack trace below and ldapsearch -d -1 output at http://people.redhat.com/jsafrane/openldap-referral-crash/, together with LDAP traffic in pcap format.
IMO, if there are more referrals involved (A refers to B, which refers to C), wait4msg() and try_read1msg() functions are called recursively several times. When referral to C is being resolved, the stack looks like this:
1: try_read1msg 2: wait4msg 3: ldap_result ... 4: ldap_chase_v3referrals 5: try_read1msg 6: wait4msg 7: ldap_result ... 8: ldap_chase_v3referrals 9: try_read1msg 10: ldap_result 11: wait4msg 12: do_search 13: main
Looking at the wait4msg(), result.c:393, it goes through all connections inside LDAP structure: for ( lc = ld->ld_conns; rc == LDAP_MSG_X_KEEP_LOOKING && lc != NULL; ) { if ( lc->lconn_status == LDAP_CONNST_CONNECTED && ldap_is_read_ready( ld, lc->lconn_sb ) ) { <...mutexes...> rc = try_read1msg( ld, msgid, all, &lc, result ); <...mutexes...> if ( lc == NULL ) { /* if lc gets free()'d, * there's no guarantee * lc->lconn_next is still * sane; better restart * (ITS#4405) */ lc = ld->ld_conns;
/* don't get to next conn! */ break; } }
/* next conn */ lc = lc->lconn_next; }
And try_read1msg() can release the connection and change the list of connections, see result.c:944:
ldap_free_connection( ld, lc, 0, 1 ); <...mutexes...> lc = *lcp = NULL;
The calling wait4msg() expects that called try_read1msg() may free a connection and does not crash. But if more wait4msg() are called recursively, the "bottom ones", #6 and #11 in my stack example above, do not know about the change which try_read1msg() in #1 has made and can hold already freed 'lc' pointer -> crash.
IMHO there should be some other check, that ensures that wait4msg() restarts the loop even if some inner try_read1msg() modifies the list. Or there could be some ldap_use_connecion() in try_read1msg() when referrals are being chased, so the connection does get freed only by the right try_read1msg() - #5 in my case and not #1. However, my knowledge of OpenLDAP internals is somewhat limited and I am not able to tell which way is the best.
It's quite hard to reproduce the bug, the packets must arrive in the right order, so the connection to B is being freed when referral to C is being chased. I am not able to reproduce it in more controlled environment, the log is quite huge and it contains some other not so interesting referrals. The connection, which is freed too early, is connection to 'sec.broken.net' - look for the last 'ldap_free_connection: actually freed' in the log.
I hope my explanation above is somewhat readable and I didn't confuse you too much :). Do not hesitate to ask for specific details.
Here is the promised real stack trace. It asserts when the 'lc' in wait4msg() has been freed by previous call to try_read1msg.
(gdb) bt full #0 0x0000003b49632215 in raise () from /lib64/libc.so.6 No symbol table info available. #1 0x0000003b49633d83 in abort () from /lib64/libc.so.6 No symbol table info available. #2 0x0000003b4962b039 in __assert_fail () from /lib64/libc.so.6 No symbol table info available. #3 0x000000000044d4b1 in ber_sockbuf_ctrl (sb=0x0, opt=8, arg=0x0) at sockbuf.c:88 p = (Sockbuf_IO_Desc *) 0x452f65 ret = 0 __PRETTY_FUNCTION__ = "ber_sockbuf_ctrl" #4 0x0000000000416222 in try_read1msg (ld=0x1b80fa0, msgid=18, all=1, lcp=0x7fffe4f67ae8, result=0x7fffe4f67be8) at result.c:1190 ber = (BerElement *) 0x1b8c040 newmsg = (LDAPMessage *) 0x1b91ca0 l = (LDAPMessage *) 0x0 prev = (LDAPMessage *) 0x1b8d370 id = 2 idx = 0 tag = 115 len = 78 foundit = 0 lr = (LDAPRequest *) 0x0 tmplr = (LDAPRequest *) 0x1b8d220 dummy_lr = {lr_msgid = 0, lr_status = 0, lr_refcnt = 0, lr_outrefcnt = 0, lr_abandoned = 0, lr_origid = 0, lr_parentcnt = 0, lr_res_msgtype = 0, lr_res_errno = 0, lr_res_error = 0x0, lr_res_matched = 0x0, lr_ber = 0x0, lr_conn = 0x0, lr_dn = {bv_len = 0, bv_val = 0x0}, lr_parent = 0x0, lr_child = 0x0, lr_refnext = 0x0, lr_prev = 0x0, lr_next = 0x0} lc = (LDAPConn *) 0x1b8e3e0 tmpber = {ber_opts = {lbo_valid = 2, lbo_options = 1, lbo_debug = -1}, ber_tag = 0, ber_len = 87, ber_usertag = 0, ber_buf = 0x1b8c0a0 "\002\001\005s\204", ber_ptr = 0x1b8c0f7 "", ber_end = 0x1b8c0f7 "", ber_sos = 0x0, ber_rwptr = 0x0, ber_memctx = 0x0} rc = -2 refer_cnt = 1 hadref = 1 simple_request = 0 err = 5 lderr = 1 __PRETTY_FUNCTION__ = "try_read1msg" bv_nod = {bv_len = 22, bv_val = 0x452b0b "1.3.6.1.4.1.1466.20036"} #5 0x0000000000414c67 in wait4msg (ld=0x1b80fa0, msgid=18, all=1, timeout=0x7fffe4f67bf0, result=0x7fffe4f67be8) at result.c:402 err = 0 lc_ready = 0 rc = -2 tv = {tv_sec = 0, tv_usec = 100000} tv0 = {tv_sec = 0, tv_usec = 100000} start_time_tv = {tv_sec = 1228828417, tv_usec = 878311} tvp = (struct timeval *) 0x7fffe4f67b10 lc = (LDAPConn *) 0x1b8e3e0 __PRETTY_FUNCTION__ = "wait4msg" #6 0x0000000000414433 in ldap_result (ld=0x1b80fa0, msgid=18, all=1, timeout=0x7fffe4f67bf0, result=0x7fffe4f67be8) at result.c:127 lm = (LDAPMessage *) 0x0 rc = 18 __PRETTY_FUNCTION__ = "ldap_result" #7 0x0000000000428313 in ldap_new_connection (ld=0x1b80fa0, srvlist=0x7fffe4f67df0, use_ldsb=0, connect=1, bind=0x7fffe4f67dc0) at request.c:560 tv = {tv_sec = 0, tv_usec = 100000} res = (LDAPMessage *) 0x0 msgid = 18 rc = 0 passwd = {bv_len = 0, bv_val = 0x0} err = 1 savedefconn = (LDAPConn *) 0x1b8a380 lc = (LDAPConn *) 0x1b8c690 async = 0 __PRETTY_FUNCTION__ = "ldap_new_connection" ---Type <return> to continue, or q <return> to quit--- #8 0x00000000004276b5 in ldap_send_server_request (ld=0x1b80fa0, ber=0x1b8d290, msgid=17, parentreq=0x1b8a6e0, srvlist=0x7fffe4f67df0, lc=0x0, bind=0x7fffe4f67dc0) at request.c:207 lr = (LDAPRequest *) 0x1b8d290 incparent = 1 rc = 389 #9 0x0000000000429be6 in ldap_chase_v3referrals (ld=0x1b80fa0, lr=0x1b8cb40, refs=0x0, sref=1, errstrp=0x1b8cb70, hadrefp=0x7fffe4f67f2c) at request.c:1209 unfollowed = 0x0 unfollowedcnt = 0 origreq = (LDAPRequest *) 0x1b8a6e0 srv = (LDAPURLDesc *) 0x1b8c7b0 ber = (BerElement *) 0x1b8d290 refarray = (char **) 0x1b8c670 lc = (LDAPConn *) 0x0 rc = 0 count = 0 i = 0 j = 0 id = 17 rinfo = {ri_msgid = 2, ri_request = 99, ri_url = 0x1b91c30 "ldap://DomainDnsZones.sec.broken.net/DC=DomainDnsZones,DC=sec,DC=broken,DC=net"} #10 0x000000000041553d in try_read1msg (ld=0x1b80fa0, msgid=-1, all=0, lcp=0x7fffe4f680e8, result=0x7fffe4f68228) at result.c:729 refs = (char **) 0x1b8c670 ber = (BerElement *) 0x1b8ae00 newmsg = (LDAPMessage *) 0x0 l = (LDAPMessage *) 0x1e4f68080 prev = (LDAPMessage *) 0x7fffe4f68080 id = 2 idx = 0 tag = 115 len = 80 foundit = 0 lr = (LDAPRequest *) 0x1b8cb40 tmplr = (LDAPRequest *) 0x1b8a550 dummy_lr = {lr_msgid = 0, lr_status = 0, lr_refcnt = 0, lr_outrefcnt = 0, lr_abandoned = 0, lr_origid = 0, lr_parentcnt = 0, lr_res_msgtype = 0, lr_res_errno = 0, lr_res_error = 0x0, lr_res_matched = 0x0, lr_ber = 0x0, lr_conn = 0x0, lr_dn = {bv_len = 0, bv_val = 0x0}, lr_parent = 0x0, lr_child = 0x0, lr_refnext = 0x0, lr_prev = 0x0, lr_next = 0x0} lc = (LDAPConn *) 0x1b8beb0 tmpber = {ber_opts = {lbo_valid = 2, lbo_options = 1, lbo_debug = -1}, ber_tag = 0, ber_len = 89, ber_usertag = 0, ber_buf = 0x1b8c600 "\002\001\003s\204", ber_ptr = 0x1b8c659 "", ber_end = 0x1b8c659 "", ber_sos = 0x0, ber_rwptr = 0x0, ber_memctx = 0x0} rc = -2 refer_cnt = 0 hadref = 0 simple_request = 0 err = 4 lderr = 1 __PRETTY_FUNCTION__ = "try_read1msg" bv_nod = {bv_len = 22, bv_val = 0x452b0b "1.3.6.1.4.1.1466.20036"} #11 0x0000000000414c67 in wait4msg (ld=0x1b80fa0, msgid=-1, all=0, timeout=0x0, result=0x7fffe4f68228) at result.c:402 err = 0 lc_ready = 0 rc = -2 tv = {tv_sec = 0, tv_usec = 0} tv0 = {tv_sec = 0, tv_usec = 0} start_time_tv = {tv_sec = 0, tv_usec = 0} tvp = (struct timeval *) 0x0 lc = (LDAPConn *) 0x1b8beb0 __PRETTY_FUNCTION__ = "wait4msg" #12 0x0000000000414433 in ldap_result (ld=0x1b80fa0, msgid=-1, all=0, timeout=0x0, result=0x7fffe4f68228) at result.c:127 lm = (LDAPMessage *) 0x0 rc = 100 __PRETTY_FUNCTION__ = "ldap_result" ---Type <return> to continue, or q <return> to quit--- #13 0x0000000000408dbf in dosearch (ld=0x1b80fa0, base=0x1b7f5e0 "dc=broken,dc=net", scope=2, filtpatt=0x0, value=0x7fffe4f6b881 "(sAMAccountName=uzivatel)", attrs=0x0, attrsonly=0, sctrls=0x0, cctrls=0x0, timeout=0x0, sizelimit=-1) at ldapsearch.c:1280 filter = 0x7fffe4f6b881 "(sAMAccountName=uzivatel)" rc = 0 nresponses = 7 nentries = 1 nreferences = 6 nextended = 0 npartial = 0 res = (LDAPMessage *) 0x0 msg = (LDAPMessage *) 0x0 msgid = 2 retoid = 0x0 retdata = (struct berval *) 0x0 nresponses_psearch = -1 cancel_msgid = -1 #14 0x00000000004084cc in main (argc=14, argv=0x7fffe4f6a4f8) at ldapsearch.c:1112 filtpattern = 0x7fffe4f6b881 "(sAMAccountName=uzivatel)" attrs = (char **) 0x0 line = '\0' <repeats 1344 times>, "\006\000\000\000\000\000\000\000\200\220öäÿ\177\000\000\220\220öäÿ\177\000\000\231C!J;\000\000\000\237C!J;\000\000\000\023pdI;", '\0' <repeats 47 times>, "º\005\000\000\004\000\000\000\000\000\000\000à\220öäÿ\177\000\000ð\220öäÿ\177\000\000ÔB!J;\000\000\000ØB!J;\000\000\000\023pdI;", '\0' <repeats 47 times>, "º\005\000\000X\220öäÿ\177\000\000x\220öäÿ\177\000\000 \220öäÿ\177\000\000p\220öäÿ\177\000\000Ð\213öäÿ\177", '\0' <repeats 18 times>, "\205ç Iøÿÿÿ\000\000\000\000\000\000\000\000èÔAI;\000\000\000"... fp = (FILE *) 0x0 rc = 4213523 rc1 = 0 i = 0 first = 32767 ld = (LDAP *) 0x1b80fa0 seber = (BerElement *) 0x0 vrber = (BerElement *) 0x0 syncber = (BerElement *) 0x0 syncbvalp = (struct berval *) 0x0 err = 4216832 (gdb)