[I’ve posted this on the OpenStack list as well, but maybe someone here knows more]
I’m setting up (Open)LDAP (v2.4.40) on my old Newton installation, with the LDAP servers behind a HAProxy LB.
I’m trying to have one at a time enabled to see if I can get them working individually before I try them as a whole/group..
I tried all day yesterday, and I could do the initial connection, but not get any results:
ldap_sasl_interactive_bind_s: Can't contact LDAP server (-1)
I see the connection in syslog on the LDAP server, but don’t get any results back.
Now, first thing I did this morning was to just run the exact same command (kinit && ldapwhoami) that I did last night.
AND IT WORKED!!
No idea why! It shouldn’t have. Glad it did, but since I can’t explain WHY it worked, it’s annoying!! :)
So I then disabled that (working) LDAP server in the LB member list and enabled the second. And now that is experiencing the same problem as the first yesterday…
I didn’t change anything else - last thing I did before I went to bed last night was try the ldapwhoami command -> “can’t contact ldap server”. And the very first thing I did this morning was kdestroy my ticket, get a new one and then run ldapwhoami.
I’ve run with multiple types of debugging, but there’s nothing obvious that I can see, either from ‘-d -1’ or with KRB5_TRACE set).
So … “something” internally in OS changed. Any suggestions to what or how to debug this?
What is ldap_sasl_interactive_bind_s() actually doing? Why does the ldap_bind() earlier seem to work, but not the SASL bind?
See http://bayour.com/misc/ldapwhoami_output.txt http://bayour.com/misc/ldapwhoami_output.txt for full output from
KRB5_TRACE=/dev/stdout ldapwhoami -YGSSAPI -H ldaps://ldap.bayour.net -d -1
and while this is happening, this is the output from slapd in the logs (running with “loglevel sync stats):
Nov 19 12:42:40 admin-auth-ldap-31 slapd[26613]: conn=1015 fd=29 ACCEPT from IP=10.0.17.34:53451 (IP=10.0.17.31:636) Nov 19 12:42:40 admin-auth-ldap-31 slapd[26613]: conn=1015 fd=29 TLS established tls_ssf=256 ssf=256 Nov 19 12:42:40 admin-auth-ldap-31 slapd[26613]: conn=1015 op=0 BIND dn="" method=163 Nov 19 12:43:09 admin-auth-ldap-31 slapd[26613]: conn=1013 fd=22 closed (connection lost)
With ‘loglevel -1’ (and filtering out 'daemon: epoll: listen|daemon: activity on’ because it ends up filling the screen), I get:
Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: slap_listener_activate(12): Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: >>> slap_listener(ldaps://admin-auth-ldap-31.bayour.net:636/) Nov 19 12:49:28 admin-auth-ldap-31 slapd[27043]: daemon: listen=12, new connection on 25 Nov 19 12:49:29 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:33 admin-auth-ldap-31 slapd[27043]: daemon: added 25r (active) listener=(nil) Nov 19 12:49:33 admin-auth-ldap-31 slapd[27043]: conn=1001 fd=25 ACCEPT from IP=10.0.17.34:54740 (IP=10.0.17.31:636) Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: 25r Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25 Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: connection_get(25) Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001 Nov 19 12:49:34 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001 Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: 25r Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25 Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_get(25) Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001 Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001 Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: connection_read(25): unable to get TLS client DN, error=49 id=1001 Nov 19 12:49:35 admin-auth-ldap-31 slapd[27043]: conn=1001 fd=25 TLS established tls_ssf=256 ssf=256 Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: 25r Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25 Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: connection_get(25) Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001 Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001 Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: op tag 0x60, time 1511095776 Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: conn=1001 op=0 do_bind Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: >>> dnPrettyNormal: <> Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: <<< dnPrettyNormal: <>, <> Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: conn=1001 op=0 BIND dn="" method=163 Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: do_bind: dn () SASL mech GSSAPI Nov 19 12:49:36 admin-auth-ldap-31 slapd[27043]: ==> sasl_bind: dn="" mech=GSSAPI datalen=617 Nov 19 12:49:37 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:54 admin-auth-ldap-31 slapd[27043]: Nov 19 12:49:55 admin-auth-ldap-31 slapd[27043]: Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: 25r Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: daemon: read active on 25 Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_get(25) Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_get(25): got connid=1001 Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_read(25): checking for input on id=1001 Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: ber_get_next on fd 25 failed errno=0 (Success) Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_read(25): input error=-2 id=1001, closing. Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_closing: readying conn=1001 sd=25 for close Nov 19 12:50:26 admin-auth-ldap-31 slapd[27043]: connection_close: deferring conn=1001 sd=25 Nov 19 12:50:27 admin-auth-ldap-31 slapd[27043]: Nov 19 12:50:28 admin-auth-ldap-31 slapd[27043]:
So nothing obvious that I can see. Which is reasonable, because “eventually” it worked on the previous LDAP server, so can’t be a slapd problem. But I was hoping someone that have tried this on OS or behind a HAProxy setup might be able to shed some light on this.
PS. I’ve done the exact same thing at work, in AWS and there it works just fine. So I’m fairly certain it’s something with OS/HAProxy, but I don’t know how to debug that bit..