Hello,
I am facing an issue with syncrepl and STARTTLS on 389 port. The kind of problem happening only sometimes, and disappearing "by itself". I use Debian Jessie, OpenLDAP 2.4.40+dfsg-1+deb8u2.
Description : I have a production environment, and a preproduction environment. Both of them consist of one provider and multiple consumers.
Right now all is working fine. Then 2 times a days a script injects the production data in the preproduction environment : ldapsearch on prod / ldapdelete on preprod / stop slapd on preprod / slapadd on preprod / start slapd on preprod.
Then sometimes during this injection I get a problem : one or more of the consumers in preprod fail to sync. But sometimes all are OK. Then the consumers facing the issue succeed syncrepl again "by themselves" after a few dozens of minutes / hours. If I restart slapd on the failing consumers, then the syncrepl succeeds immediately. When the syncrepl is failing, I can still perform an ldapsearch using STARTTLS and the syncrepl binding credentials from the consumer to the provider. There is no issue with non-TLS LDAP replications.
Also, using ldapsearch with option "-d 3" I get TLS debug, but even with "olcLogLevel -1" in cn=config I can't get precise debug for TLS during syncrepl.
I captured the trafic and when there is the error, wireshark at some point can't understand TLS sent by the consumer (SSL "ignored unknown record"). Corresponding data is "30 05 02 01 02 42 00". Not sure it's related to the problem though, but it looks like since (if I am right) the first information in TLS is the content-type, and 30 is not a valid value for content type (https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml#tls-par...).
Any idea ? I would like to avoid making the injection script connect to all my preprod consumers and restart slapd on them. Please feel free to ask for more logs / configurations.
Log on the provider : May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: slap_listener_activate(9): May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 busy May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: >>> slap_listener(ldap:///) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: listen=9, new connection on 18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: added 18r (active) listener=(nil) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 fd=18 ACCEPT from IP=192.168.115.61:33816 (IP=0.0.0.0:389) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: 18r May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: op tag 0x77, time 1496231248 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 do_extended May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 EXT oid=1.3.6.1.4.1.1466.20037 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: do_extended: oid=1.3.6.1.4.1.1466.20037 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 STARTTLS May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: send_ldap_extended: err=0 oid= len=0 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: send_ldap_response: msgid=1 tag=120 err=0 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 RESULT oid= err=0 text= May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: 18r May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: 18r May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): unable to get TLS client DN, error=49 id=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 fd=18 TLS established tls_ssf=128 ssf=128 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 2 descriptors May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: 18r May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: ber_get_next on fd 18 failed errno=0 (Success) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): input error=-2 id=1010, closing. May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_closing: readying conn=1010 sd=18 for close May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_close: conn=1010 sd=18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: removing 18 May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 fd=18 closed (connection lost) May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
Log on the consumer : May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: =>do_syncrepl rid=006 May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=11 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: slap_client_connect: URI=ldap://ldap-master-pp.acme Error, ldap_start_tls failed (-11) May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: do_syncrepl: rid=006 rc -11 retrying May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: activity on 1 descriptor May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: activity on: May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=9 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=10 active_threads=0 tvp=zero May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=11 active_threads=0 tvp=zero
syncrepl config on consumer : olcSyncrepl: {0}rid=6 provider=ldap://ldap-master-pp.acme starttls=critical t ls_reqcert=never bindmethod=simple binddn="cn=replication,ou=Applications,d c=acme,dc=fr" credentials=**** searchbase="dc=acme,dc=fr" schemacheck ing=off type=refreshAndPersist filter="(objectClass=*)" attrs="*" scope=sub retry="60 +"
TLS config on provider :
dn: cn=config objectClass: olcGlobal cn: config olcArgsFile: /var/run/slapd/slapd.args olcPidFile: /var/run/slapd/slapd.pid olcToolThreads: 1 structuralObjectClass: olcGlobal entryUUID: 9efabaca-b31f-1036-9746-2d4f8feb88f1 creatorsName: cn=config createTimestamp: 20170411162820Z olcTLSCertificateFile: /etc/ssl/certs/ssl-cert-snakeoil.pem olcTLSCertificateKeyFile: /etc/ssl/private/ssl-cert-snakeoil.key olcLogLevel: 256 entryCSN: 20170531131949.877388Z#000000#000#000000 modifiersName: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth modifyTimestamp: 20170531131949Z