ldap_sasl_interactive_bind_s: Can't contact LDAP server - openldap-technical

19 Nov 2017


      [I’ve posted this on the OpenStack list as well, but maybe someone
 here knows more]
I’m setting up (Open)LDAP (v2.4.40) on my old Newton installation,
with the LDAP servers behind a HAProxy LB.
I’m trying to have one at a time enabled to see if I can get them
working individually before I try them as a whole/group..
I tried all day yesterday, and I could do the initial connection, but
not get any results:
ldap_sasl_interactive_bind_s: Can't contact LDAP server (-1)
I see the connection in syslog on the LDAP server, but don’t get any
results back.
Now, first thing I did this morning was to just run the exact same
command (kinit && ldapwhoami) that I did last night.
AND IT WORKED!!
No idea why! It shouldn’t have. Glad it did, but since I can’t explain
WHY it worked, it’s annoying!! :)
So I then disabled that (working) LDAP server in the LB member list
and enabled the second. And now that is experiencing the same
problem as the first yesterday…
I didn’t change anything else - last thing I did before I went to bed
last night was try the ldapwhoami command -> “can’t contact ldap
server”. And the very first thing I did this morning was kdestroy
my ticket, get a new one and then run ldapwhoami.
I’ve run with multiple types of debugging, but there’s nothing obvious
that I can see, either from ‘-d -1’ or with KRB5_TRACE set).
So … “something” internally in OS changed. Any suggestions to what
or how to debug this?
What is ldap_sasl_interactive_bind_s() actually doing? Why does the
ldap_bind() earlier seem to work, but not the SASL bind?
See http://bayour.com/misc/ldapwhoami_output.txt http://bayour.com/misc/ldapwhoami_output.txt for full output from
KRB5_TRACE=/dev/stdout ldapwhoami -YGSSAPI -H ldaps://ldap.bayour.net -d -1
and while this is happening, this is the output from slapd in the logs
(running with “loglevel sync stats):
Nov 19 12:42:40 admin-auth-ldap-31 slapd[26613]: conn=1015 fd=29 ACCEPT from IP=10.0.17.34:53451 (IP=10.0.17.31:636)
    Nov 19 12:42:40 admin-auth-ldap-31 slapd[26613]: conn=1015 fd=29 TLS established tls_ssf=256 ssf=256
    Nov 19 12:42:40 admin-auth-ldap-31 slapd[26613]: conn=1015 op=0 BIND dn="" method=163
    Nov 19 12:43:09 admin-auth-ldap-31 slapd[26613]: conn=1013 fd=22 closed (connection lost)
With ‘loglevel -1’ (and filtering out 'daemon: epoll: listen|daemon: activity on’
because it ends up filling the screen), I get:
Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: slap_listener_activate(12):
    Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: >>> slap_listener(ldaps://admin-auth-ldap-31.bayour.net:636/)
    Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: daemon: listen=12, new connection on 25
    Nov 19 12:49:29 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:33 admin-auth-ldap-31 slapd[27043]: daemon: added 25r (active) listener=(nil)
    Nov 19 12:49:33 admin-auth-ldap-31 slapd[27043]: conn=1001 fd=25 ACCEPT from IP=10.0.17.34:54740 (IP=10.0.17.31:636)
    Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]:  25r
    Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25
    Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: connection_get(25)
    Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001
    Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]:  25r
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_get(25)
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_read(25): unable to get TLS client DN, error=49 id=1001
    Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: conn=1001 fd=25 TLS established tls_ssf=256 ssf=256
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]:  25r
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: connection_get(25)
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: op tag 0x60, time 1511095776
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: conn=1001 op=0 do_bind
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: >>> dnPrettyNormal: <>
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: <<< dnPrettyNormal: <>, <>
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: conn=1001 op=0 BIND dn="" method=163
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: do_bind: dn () SASL mech GSSAPI
    Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: ==> sasl_bind: dn="" mech=GSSAPI datalen=617
    Nov 19 12:49:37 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:54 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:49:55 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]:  25r
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_get(25)
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: ber_get_next on fd 25 failed errno=0 (Success)
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_read(25): input error=-2 id=1001, closing.
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_closing: readying conn=1001 sd=25 for close
    Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_close: deferring conn=1001 sd=25
    Nov 19 12:50:27 admin-auth-ldap-31 slapd[27043]:
    Nov 19 12:50:28 admin-auth-ldap-31 slapd[27043]:
So nothing obvious that I can see. Which is reasonable, because
“eventually” it worked on the previous LDAP server, so can’t be
a slapd problem. But I was hoping someone that have tried this
on OS or behind a HAProxy setup might be able to shed some
light on this.
PS. I’ve done the exact same thing at work, in AWS and there it
       works just fine. So I’m fairly certain it’s something with OS/HAProxy,
       but I don’t know how to debug that bit..