Dear team
We hope you can help us in discovering where or
how we are wrong.
We are providing a new sasl mechanism for third party
strong authentication and we are in the final delivery step.
We are current with openldap 2.3.38.
We used strong authentication server to carry out the
authorization part of the mech and our backside
(sasl client) is Openldap 2.3.38.
On the client side we have pam-ldap+ssh.
sasl libs are cyrus 2.1.22
os is linux red-hat cent-os 4.7 final
During our functional and stress tests everything
works fine but once in a while when openldap crashes
and this happens only, but not always, under a certain
condition:
Before explaining such a condition we would like to establish
the meaning of some terms:
1) straight authentication session: a simple (not
interactive) authentication, being either successfull or not
(right or wrong passcode)
2) interactive authentication session: a complex
(interactive) authentication with iterated tcp/ip requests such as
next-token or new-pin
Crash scenario:
one ore more straight authentication scripts are
running in parallel, emulating parallel ssh authentication sessions
with either right or wrong passcode;
one interactive authentication session:
in the error-originating session the whole user
back-and-forth got his way to the authentication server, that is to say next-token
or new-pin procedure comes to a complete end:
user gives its next token or new pin (twice) and the
strong authentication is fully accomplished and reported by the authentication
manager.
What is wrong is that (by average 2 attempts out of 3)
make openldap crash.
Sasl speaking our server-step mechanism closes fine
and we expect its dispose step to be called by openldap (by the
sasl_server_end)
That is not going to happen being Openldap crashed
meanwhile.
Again this happens only but not always when the
stress scripts are running.
Does it smell of thread concurrency and safety?
Our different use cases tests run almost always successfully
when done individually or in parallel but manually (real people operating).
Conversely when stress test are going on the problem
arise very often but not always.
We had the Opeldap crash both on linux and solaris installations
though on linux the crash happens some steps earlier than with solaris.
On linux we have had the opportunity to rebuild
openldap with some debug instruction inside.
what came out is
connection.c
crashing at:
op->o_tmpfree(
op->orb_edn.bv_val, op->o_tmpmemctx );
in both successful and error cases either the values
passed and the function address look alike.
op->orb_edn.bv_val shows the distinguished name of
the user just logged in (op->orb_edn.bv_len is correctly set with its
length).
Based on the function name we guess ldap is about to
free the DN allocated storage.
On Solaris9 we had the segfault
symptom but we have some more line on the log, one of which shows a almost empty
DN (!!!) made just of uid=””
GDB has the following from the linux core dump:
gdb -c core.5785
GNU gdb Red Hat Linux (6.3.0.0-1.159.el4rh) Copyright
2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General
Public License, and you are welcome to change it and/or distribute copies of it
under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type
"show warranty" for details.
This GDB was configured as
"i386-redhat-linux-gnu".
Core was generated by `/usr/sbin/slapd -h ldap:///'.
Program terminated with signal 11, Segmentation
fault.
#0 0x00268934 in ?? ()
THE FOLLOWING CODE IS AN EXCERPT OF THE FINAL
SERVER_MECH_STEP FUNCTION OF THE MECHANISM WHICH RUN BEHIND THE
SASL_SERVER_STEP ONLY WHEN CRASH HAPPENS:
static int mymech_server_mech_step(void *conn_context
__attribute__((unused)),
sasl_server_params_t *params,
const char *clientin,
unsigned clientinlen,
const char **serverout,
unsigned *serveroutlen,
sasl_out_params_t *oparams)
{
server_context_t
*text = (server_context_t *) conn_context;
*serverout
= NULL;
*serveroutlen
= 0;
oparams->doneflag
= 1;
oparams->mech_ssf
= 0;
oparams->maxoutbuf
= 0;
oparams->encode_context
= NULL;
oparams->encode
= NULL;
oparams->decode_context
= NULL;
oparams->decode
= NULL;
oparams->param_version
= 0;
oparams->authid
= text->authid;
return
SASL_OK;
}
SLAPD.LOG OF THE
CRASH CASE (WITH OUR ADDED SYSLOGS):
Feb 4 12:09:15 yavcentos slapd: SASL [conn=48]
Debug MMYMECH====================== SERVER mymech_server_mech_step END ==
Feb 4 12:09:15 yavcentos mymech: sasl.c label
slap_sasl_authorize
Feb 4 12:09:15 yavcentos slapd: SASL proxy
authorize [conn=48]: authcid="alfredo@yavcentos"
authzid="alfredo@yavcentos"
Feb 4 12:09:15 yavcentos slapd: conn=48 op=4
BIND authcid="alfredo@yavcentos"
authzid="alfredo@yavcentos"
Feb 4 12:09:15 yavcentos slapd: SASL Authorize
[conn=48]: proxy authorization allowed authzDN=""
Feb 4 12:09:15 yavcentos mymech: sasl.c label1
Feb 4 12:09:15 yavcentos mymech: result.c
label2
Feb 4 12:09:15 yavcentos slapd: send_ldap_sasl:
err=0 len=-1
Feb 4 12:09:15 yavcentos mymech: result.c
label6a
Feb 4 12:09:15 yavcentos mymech: result.c
label61
Feb 4 12:09:15 yavcentos mymech: result.c
label61a (tra qui)
Feb 4 12:09:15 yavcentos mymech: result.c
label61b
Feb 4 12:09:15 yavcentos mymech: result.c
label61b2 op->o_callback->sc_response=0x805e1fc op=0x8926cb0 rs=0x17551a0
Feb 4 12:09:15 yavcentos mymech: connection.c
label1
Feb 4 12:09:15 yavcentos mymech: connection.c
label1a
Feb 4 12:09:15 yavcentos mymech: connection.c
label11a
Feb 4 12:09:15 yavcentos mymech: connection.c
label111a
Feb 4 12:09:15 yavcentos mymech: connection.c
label12a
Feb 4 12:09:15 yavcentos mymech: connection.c
label12a op->o_tmpfree=0x809c3e8
op->orb_edn.bv_val=uid=alfredo@yavcentos,ou=
people,dc=my-domain,dc=com op->orb_edn.bv_len=51
op->o_tmpmemctx=0x8926c10 (slab malloc context)
END
SLAPD.LOG OF A
SUCCESSFUL LOGIN CASE (WITH OUR ADDED
SYSLOGS):
Feb 4 12:13:13 yavcentos mymech: sasl.c label
slap_sasl_authorize
Feb 4 12:13:13 yavcentos slapd: SASL proxy
authorize [conn=3]: authcid="alfredo@yavcentos"
authzid="alfredo@yavcentos"
Feb 4 12:13:14 yavcentos slapd: conn=3 op=4
BIND authcid="alfredo@yavcentos"
authzid="alfredo@yavcentos"
Feb 4 12:13:14 yavcentos slapd: SASL Authorize
[conn=3]: proxy authorization allowed authzDN=""
Feb 4 12:13:14 yavcentos mymech: sasl.c label1
Feb 4 12:13:14 yavcentos mymech: result.c
label2
Feb 4 12:13:14 yavcentos slapd: send_ldap_sasl:
err=0 len=-1
Feb 4 12:13:14 yavcentos mymech: result.c
label6a
Feb 4 12:13:14 yavcentos mymech: result.c
label61
Feb 4 12:13:14 yavcentos mymech: result.c
label61a (tra qui)
Feb 4 12:13:14 yavcentos mymech: result.c
label61b
Feb 4 12:13:14 yavcentos mymech: result.c
label61b2 op->o_callback->sc_response=0x805e1fc op=0x86667b8 rs=0x51271a0
Feb 4 12:13:14 yavcentos mymech: connection.c
label1
Feb 4 12:13:14 yavcentos mymech: connection.c
label1a
Feb 4 12:13:14 yavcentos mymech: connection.c
label11a
Feb 4 12:13:14 yavcentos mymech: connection.c
label111a
Feb 4 12:13:14 yavcentos mymech: connection.c
label12a
Feb 4 12:13:14 yavcentos mymech: connection.c
label12a op->o_tmpfree=0x809c3e8
op->orb_edn.bv_val=uid=alfredo@yavcentos,ou=
people,dc=my-domain,dc=com op->orb_edn.bv_len=51
op->o_tmpmemctx=0x86660c0 (slab malloc context)
Feb 4 12:13:14 yavcentos mymech: connection.c
label13a
Feb 4 12:13:14 yavcentos mymech: connection.c
label14a
Feb 4 12:13:14 yavcentos mymech: connection.c
label15a
Feb 4 12:13:14 yavcentos mymech: connection.c
label151a
Feb 4 12:13:14 yavcentos mymech: connection.c
label151b
Feb 4 12:13:14 yavcentos mymech: connection.c
label2
Feb 4 12:13:14 yavcentos slapd: conn=3 op=4
BIND dn="uid=alfredo@yavcentos,ou=people,dc=my-domain,dc=com" mech=MYMECH
sasl_
ssf=0 ssf=0
Feb 4 12:13:14 yavcentos slapd: do_bind: SASL/MYMECH
bind: dn="uid=alfredo@yavcentos,ou=people,dc=my-domain,dc=com"
sasl_ss
f=0
Feb 4 12:13:14 yavcentos mymech: connection.c
label3
Feb 4 12:13:14 yavcentos mymech: result.c
label61c
Feb 4 12:13:14 yavcentos mymech: result.c
label61d
Feb 4 12:13:14 yavcentos mymech: result.c
label61d2
Feb 4 12:13:14 yavcentos mymech: result.c
label61e
Feb 4 12:13:14 yavcentos slapd: send_ldap_response:
msgid=5 tag=97 err=0
Feb 4 12:13:14 yavcentos mymech: result.c
label1
Feb 4 12:13:14 yavcentos mymech: result.c
label11
Feb 4 12:13:14 yavcentos mymech: result.c
label12
Feb 4 12:13:14 yavcentos mymech: result.c
label13
Feb 4 12:13:14 yavcentos mymech: result.c
label14
Feb 4 12:13:14 yavcentos mymech: result.c
label14e
Feb 4 12:13:14 yavcentos mymech: result.c
label14f
Feb 4 12:13:14 yavcentos mymech: result.c
label14g
Feb 4 12:13:14 yavcentos mymech: result.c
label14h
Feb 4 12:13:14 yavcentos mymech: result.c
label15
Feb 4 12:13:14 yavcentos mymech: result.c
label16
Feb 4 12:13:14 yavcentos mymech: result.c
label18
Feb 4 12:13:14 yavcentos mymech: result.c
label19
Feb 4 12:13:14 yavcentos mymech: result.c
label20
Feb 4 12:13:14 yavcentos mymech: result.c
label21
Feb 4 12:13:14 yavcentos mymech: result.c
label6b e qui2
Feb 4 12:13:14 yavcentos mymech: daemon:
activity on 1 descriptor
Feb 4 12:13:15 yavcentos slapd: daemon:
activity on:
Feb 4 12:13:15 yavcentos slapd: 13r
Feb 4 12:13:15 yavcentos slapd:
Feb 4 12:13:15 yavcentos slapd: daemon: read
active on 13
Feb 4 12:13:15 yavcentos slapd:
connection_get(13)
Feb 4 12:13:15 yavcentos slapd:
connection_get(13): got connid=3
Feb 4 12:13:15 yavcentos slapd:
connection_read(13): checking for input on id=3
Feb 4 12:13:15 yavcentos slapd: daemon: epoll:
listen=7 active_threads=0 tvp=NULL
Feb 4 12:13:15 yavcentos slapd: daemon: epoll:
listen=8 active_threads=0 tvp=NULL
Feb 4 12:13:15 yavcentos slapd: conn=3 op=4
RESULT tag=97 err=0 text=
Feb 4 12:13:15 yavcentos mymech: sasl.c label4
Feb 4 12:13:15 yavcentos slapd: <==
slap_sasl_bind: rc=0
Feb 4 12:13:15 yavcentos mymech: bind.c do_bind
label1
Feb 4 12:13:15 yavcentos slapd: do_bind
Feb 4 12:13:15 yavcentos slapd: conn=3 op=5
BIND anonymous mech=implicit ssf=0
Feb 4 12:13:15 yavcentos slapd: >>>
dnPrettyNormal: <>
Feb 4 12:13:15 yavcentos slapd: <<<
dnPrettyNormal: <>, <>
Feb 4 12:13:15 yavcentos slapd: do_bind: version=3
dn="" method=128
Feb 4 12:13:15 yavcentos slapd: conn=3 op=5
BIND dn="" method=128
Feb 4 12:13:15 yavcentos mymech: sasl.c label
slap_sasl_reset
Feb 4 12:13:15 yavcentos slapd:
send_ldap_result: conn=3 op=5 p=3
Feb 4 12:13:15 yavcentos slapd: send_ldap_result:
err=0 matched="" text=""
Feb 4 12:13:15 yavcentos mymech: result.c
label22a
Feb 4 12:13:15 yavcentos mymech: result.c
label61
Feb 4 12:13:15 yavcentos mymech: result.c
label61a (tra qui)
Feb 4 12:13:15 yavcentos mymech: result.c
label61b
Feb 4 12:13:15 yavcentos mymech: result.c
label61b2 op->o_callback->sc_response=0x805e1fc op=0x8668158 rs=0x2afc1a0
Feb 4 12:13:15 yavcentos mymech: connection.c
label1
Feb 4 12:13:15 yavcentos mymech: connection.c
label3
Feb 4 12:13:15 yavcentos mymech: result.c
label61c
Feb 4 12:13:15 yavcentos mymech: result.c
label61d
Feb 4 12:13:15 yavcentos mymech: result.c
label61d2
Feb 4 12:13:15 yavcentos mymech: result.c
label61e
Feb 4 12:13:15 yavcentos slapd:
send_ldap_response: msgid=6 tag=97 err=0
Feb 4 12:13:15 yavcentos mymech: result.c
label1
Feb 4 12:13:15 yavcentos mymech: result.c
label11
Feb 4 12:13:15 yavcentos mymech: result.c
label12
Feb 4 12:13:15 yavcentos mymech: result.c
label13
Feb 4 12:13:15 yavcentos mymech: result.c
label14
Feb 4 12:13:15 yavcentos mymech: result.c
label14e
Feb 4 12:13:15 yavcentos mymech: result.c
label14f
Feb 4 12:13:15 yavcentos mymech: result.c
label14g
Feb 4 12:13:15 yavcentos mymech: result.c
label14h
Feb 4 12:13:15 yavcentos mymech: result.c
label15
Feb 4 12:13:15 yavcentos mymech: result.c
label16
Feb 4 12:13:15 yavcentos mymech: result.c
label18
Feb 4 12:13:15 yavcentos mymech: result.c
label19
Feb 4 12:13:16 yavcentos mymech: result.c
label20
Feb 4 12:13:16 yavcentos mymech: result.c
label21
Feb 4 12:13:16 yavcentos mymech: result.c
label22b
Feb 4 12:13:16 yavcentos mymech: result.c
label5 e qui1
Feb 4 12:13:16 yavcentos slapd: conn=3 op=5
RESULT tag=97 err=0 text=
Feb 4 12:13:16 yavcentos slapd: do_bind: v3
anonymous bind
Feb 4 12:13:15 yavcentos mymech: daemon:
activity on 1 descriptor
Feb 4 12:13:16 yavcentos slapd: daemon:
activity on:
Feb 4 12:13:16 yavcentos slapd: 13r
Feb 4 12:13:16 yavcentos slapd:
Feb 4 12:13:16 yavcentos slapd: daemon: read
active on 13
Feb 4 12:13:16 yavcentos slapd:
connection_get(13)
Feb 4 12:13:16 yavcentos slapd:
connection_get(13): got connid=3
Feb 4 12:13:16 yavcentos slapd:
connection_read(13): checking for input on id=3
Feb 4 12:13:16 yavcentos slapd: ber_get_next on
fd 13 failed errno=0 (Success)
Feb 4 12:13:16 yavcentos slapd:
connection_read(13): input error=-2 id=3, closing.
Feb 4 12:13:16 yavcentos slapd:
connection_closing: readying conn=3 sd=13 for close
Feb 4 12:13:16 yavcentos slapd:
connection_close: conn=3 sd=-1
Feb 4 12:13:16 yavcentos mymech: sasl.c label
slap_sasl_close
Feb 4 12:13:16 yavcentos mymech: sasl.c label
slap_sasl_log
Feb 4 12:13:16 yavcentos slapd: SASL [conn=-1]
Debug: MMYMECHSERVER mymech_server_mech_dispose
Feb 4 12:13:16 yavcentos mymech: sasl.c label
slap_sasl_log
Feb 4 12:13:16 yavcentos slapd: SASL [conn=-1]
Error: MMYMECHSERVER mymech_server_mech_dispose: start - receives
*conn_co
ntext=0x8666a10 *utils=0x8666188
Feb 4 12:13:16 yavcentos mymech: sasl.c label
slap_sasl_log
Feb 4 12:13:17 yavcentos slapd: SASL [conn=-1]
Error: MMYMECHSERVER mymech_server_mech_dispose
Feb 4 12:13:17 yavcentos slapd: daemon:
removing 13
Feb 4 12:13:17 yavcentos slapd: conn=3 fd=13
closed (connection lost)
Feb 4 12:13:17 yavcentos slapd: daemon: epoll:
listen=7 active_threads=0 tvp=NULL
Feb 4 12:13:17 yavcentos slapd: daemon: epoll:
listen=8 active_threads=0 tvp=NULL
Feb 4 12:13:17 yavcentos slapd: daemon:
activity on 1 descriptor
Feb 4 12:13:17 yavcentos slapd: daemon:
activity on:
Feb 4 12:13:17 yavcentos slapd:
Feb 4 12:13:17 yavcentos slapd: daemon: epoll:
listen=7 active_threads=0 tvp=NULL
Feb 4 12:13:17 yavcentos slapd: daemon: epoll:
listen=8 active_threads=0 tvp=NULL
END
Compile and link options of mymech:
gcc –I ..... -v -Wall -W -O2 -MT mymech.lo -MD
-MP -MF .deps/mymech.Tpo -c mymech.c -fPIC -DPIC -o mymech.lo
gcc -shared mymech.lo mymech _init.lo
plugin_common.lo -lpthread -lcrypt -lresolv -lc -Wl,-soname -Wl,lib mymech.so.2
-o .libs/lib mymech.so.2.0.22
Would you have a clue or a though for us about some
further check to do over either our code or compile options,
(maybe particularly on the thread-safe topic)?
Would you suggest some thread safe programming
strategy?
Many many thanks
Francesco Grossi