jvcelak@redhat.com wrote:
Here is what I have tried. This is the processing of the failing ldap_result():
(gdb) c Continuing.
Breakpoint 3, ldap_is_read_ready (ld=0x7f48f3a170a0, sb=0x7f48f3a22580) at ../../../libraries/libldap/os-ip.c:1005 1005 sip = (struct selectinfo *)ld->ld_selectinfo; (gdb) n 1007 if (ber_sockbuf_ctrl( sb, LBER_SB_OPT_DATA_READY, NULL )) (gdb) 1010 ber_sockbuf_ctrl( sb, LBER_SB_OPT_GET_FD,&sd ); (gdb) 1016 for(i=0; i< sip->si_maxfd; i++) { (gdb) 1017 if( sip->si_fds[i].fd == sd ) { (gdb) 1018 return sip->si_fds[i].revents& POLL_READ; (gdb) p sip->si_fds[0].revents $14 = 4 (gdb) bt full #0 ldap_is_read_ready (ld=0x7f48f3a170a0, sb=0x7f48f3a22580) at ../../../libraries/libldap/os-ip.c:1018 i = 0 sip = 0x7f48f3a2f590 sd = 17 #1 0x00007f48f17ab743 in wait4msg (ld=0x7f48f3a170a0, msgid=-1, all=0, timeout=0x7fff16e75da0, result=0x7fff16e75d60) at ../../../libraries/libldap/result.c:357 lnext = 0x0 serviced = 0 lc_ready = 1 rc = -2 tv = {tv_sec = 0, tv_usec = 0} tv0 = {tv_sec = 0, tv_usec = 0} start_time_tv = {tv_sec = 1329407008, tv_usec = 270973} tvp = 0x7fff16e75c70 lc = 0x7f48f3a2db80 __PRETTY_FUNCTION__ = "wait4msg" #2 0x00007f48f17aaf0e in ldap_result (ld=0x7f48f3a170a0, msgid=-1, all=0, timeout=0x7fff16e75da0, result=0x7fff16e75d60) at ../../../libraries/libldap/result.c:117 rc = 32584 __PRETTY_FUNCTION__ = "ldap_result" #3 0x00007f48e6da897e in sdap_process_result (ev=0x7f48f39f94d0, pvt=0x7f48f3a38b70) at src/providers/ldap/sdap_async.c:179 sh = 0x7f48f3a38b70 no_timeout = {tv_sec = 0, tv_usec = 0} te = 0x7f48f3a2cab0 msg = 0x0 ret = 101 __FUNCTION__ = "sdap_process_result" #4 0x00007f48e6da8594 in sdap_ldap_next_result (ev=0x7f48f39f94d0, te=0x7f48f3a2cab0, tv=..., pvt=0x7f48f3a38b70) at src/providers/ldap/sdap_async.c:159 No locals. #5 0x00007f48f2f19310 in tevent_common_loop_timer_delay (ev=ev@entry=0x7f48f39f94d0) at ../tevent_timed.c:254 current_time = {tv_sec = 0, tv_usec = 0} te = 0x7f48f3a2cab0 #6 0x00007f48f2f189ec in std_event_loop_once (ev=0x7f48f39f94d0, location=<optimized out>) at ../tevent_standard.c:558 std_ev = 0x7f48f39f9590 tval = {tv_sec = 0, tv_usec = 0} #7 0x00007f48f2f15cb0 in _tevent_loop_once (ev=ev@entry=0x7f48f39f94d0, location=location@entry=0x7f48f37e52e7 "src/util/server.c:572") at ../tevent.c:504 ret =<optimized out> nesting_stack_ptr = 0x0 #8 0x00007f48f2f15e3b in tevent_common_loop_wait (ev=0x7f48f39f94d0, location=0x7f48f37e52e7 "src/util/server.c:572") at ../tevent.c:605 ret =<optimized out> #9 0x00007f48f37b8a85 in server_loop (main_ctx=0x7f48f39fa640) at src/util/server.c:572 No locals. #10 0x00007f48f3779a88 in main (argc=5, argv=0x7fff16e763a8) at src/providers/data_provider_be.c:2003 opt = -1 pc = 0x7f48f39f8010 be_domain = 0x7f48f39f8420 "foo" srv_name = 0x7f48f39f80d0 "sssd[be[foo]]" main_ctx = 0x7f48f39fa640 confdb_path = 0x7f48f39f8140 "config/domain/foo" ret = 0 long_options = {{longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x7f48f2d06200, val = 0, descrip = 0x7f48f37dcb7c "Help options:", argDescrip = 0x0}, {longName = 0x7f48f37dcb8a "debug-level", shortName = 100 'd', argInfo = 2, arg = 0x7f48f39f4738, val = 0, descrip = 0x7f48f37dcb96 "Debug level", argDescrip = 0x0}, {longName = 0x7f48f37dcba2 "debug-to-files", shortName = 102 'f', argInfo = 0, arg = 0x7f48f39f473c, val = 0, descrip = 0x7f48f37dcbb8 "Send the debug output to files instead of stderr", argDescrip = 0x0}, { longName = 0x7f48f37dcbe9 "debug-timestamps", shortName = 0 '\000', argInfo = 2, arg = 0x7f48f39f4598, val = 0, descrip = 0x7f48f37dcbfa "Add debug timestamps", argDescrip = 0x0}, {longName = 0x7f48f37dcc0f "debug-microseconds", shortName = 0 '\000', argInfo = 2, arg = 0x7f48f39f459c, val = 0, descrip = 0x7f48f37dcc28 "Show timestamps with microseconds", argDescrip = 0x0}, {longName = 0x7f48f37dcc4a "domain", shortName = 0 '\000', argInfo = 1, arg = 0x7fff16e76058, val = 0, descrip = 0x7f48f37dcc58 "Domain of the information provider (mandatory)", argDescrip = 0x0}, {longName = 0x0, shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 0, descrip = 0x0, argDescrip = 0x0}} __FUNCTION__ = "main" (gdb) fin Run till exit from #0 ldap_is_read_ready (ld=0x7f48f3a170a0, sb=0x7f48f3a22580) at ../../../libraries/libldap/os-ip.c:1028 wait4msg (ld=0x7f48f3a170a0, msgid=-1, all=0, timeout=0x7fff16e75da0, result=0x7fff16e75d60) at ../../../libraries/libldap/result.c:356 356 if ( lc->lconn_status == LDAP_CONNST_CONNECTED&& Value returned is $15 = 0 (gdb) c Continuing.
Breakpoint 2, wait4msg (ld=0x7f48f3a170a0, msgid=-1, all=0, timeout=0x7fff16e75da0, result=0x7fff16e75d60) at ../../../libraries/libldap/result.c:378 378 rc = -1;
As you may see, there is only a POLLOUT flag set:
(gdb) p sip->si_fds[0].revents $14 = 4
/usr/include/bits/poll.h:
#define POLLOUT 0x004 /* Writing now will not block. */
So I believe, that there is some other problem, because lc_ready is 1. And without the reverted patch, the operation would just keep looping with LDAP_MSG_X_KEEP_LOOKING. (Or will timeout alternatively.)
Your debug session indicates that the socket is writable but libldap doesn't have any record of being blocked for any writes, so it doesn't check for writable state.
Frankly this sounds more like an sssd bug, since libldap is still working fine for just about everything else out there. Perhaps you should set libldap's debug level to 7 and see what actually got handled in the library up to the point of failure.