Hi everyone!
I'm working on a project that uses the OpenLDAP C API (version 2.4.36) asynchronously. Everything has worked quite well through two years of development cycles and product evolution, and for about a year now a few clients have been successfully running our LDAP module on their servers.
Recently I received a core dump file from one of our clients, with these stack frames:
... libc frames ...
#6 0x00007f68887d3068 in ldap_int_bisect_find (v=<value optimized out>, n=<value optimized out>, id=<value optimized out>, idxp=<value optimized out>) at abandon.c:334
#7 0x00007f68887d32d2 in do_abandon (ld=0x7f67dcb0bbe0, origid=-1, msgid=-1, sctrls=<value optimized out>, sendabandon=1) at abandon.c:300
... my application frames ...
From my code, I'm calling openldap_ldap_abandon_ext(ld, msgid, NULL, NULL) because a timeout was reached after calling openldap_ldap_sasl_bind(...) and getting the LDAP_X_CONNECTING state, while still waiting for an LDAP_SUCCESS result.
From what I've read in abandon.c, the assert( id >= 0 ) is reached only on certain flows; I guess those are the ones involved in the later stages of the communication handshake with the server. So I suspect I'm calling openldap_ldap_abandon_ext(...) at the wrong time.
My question is: is there anything in the API (ldap.h) I can use to avoid calling openldap_ldap_abandon_ext in this specific situation? I could add code in my application to prevent the crash, but I also need to make sure the connection is aborted correctly, since I have to respect my timeout policy.
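For what it's worth, here is the minimal guard I'm considering on my side. This is only a sketch based on my reading of the core (do_abandon shows origid=-1, msgid=-1, so the bind apparently never produced a valid message ID); the helper name is mine, not from ldap.h, and I'm assuming a teardown via ldap_unbind_ext (through our openldap_ wrapper) is the right way to drop the half-open connection:

```c
#include <assert.h>

/* Hypothetical guard (my naming, not part of the OpenLDAP API).
 * The core shows do_abandon(ld, origid=-1, msgid=-1, ...), so abandoning
 * while the connection is still in LDAP_X_CONNECTING passes id = -1 down
 * to ldap_int_bisect_find(), which trips assert( id >= 0 ). */
static int bind_produced_msgid(int msgid)
{
    /* A real (non-negative) msgid exists only once the bind request was
     * actually sent; -1 means there is nothing on the wire to abandon. */
    return msgid >= 0;
}

/* In the module itself this would look roughly like:
 *
 *   if (bind_produced_msgid(msgid))
 *       openldap_ldap_abandon_ext(ld, msgid, NULL, NULL);
 *   else
 *       openldap_ldap_unbind_ext(ld, NULL, NULL);  // drop the half-open socket
 */
```

I'd still prefer an API-level check over this hand-rolled condition if one exists.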
By the way, all calls to the OpenLDAP API in my code are protected by the same boost::unique_lock<boost::mutex> to ensure thread safety. The logs show that my module was under heavy load when the application crashed. This core dump is the only information I have, and I haven't been able to reproduce the situation in my integration tests, even when simulating slow network communications and shrinking the timeouts.
Thanks in advance
for your help!