Hello,

I am facing an issue with syncrepl and STARTTLS on 389 port. The kind of problem happening only sometimes, and disappearing "by itself". I use Debian Jessie, OpenLDAP 2.4.40+dfsg-1+deb8u2.

Description : I have a production environment, and a preproduction environment. Both of them consist of one provider and multiple consumers.

Right now all is working fine. Then 2 times a days a script injects the production data in the preproduction environment : ldapsearch on prod / ldapdelete on preprod / stop slapd on preprod / slapadd on preprod / start slapd on preprod.

Then sometimes during this injection I get a problem : one or more of the consumers in preprod fail to sync. But sometimes all are OK. Then the consumers facing the issue succeed syncrepl again "by themselves" after a few dozens of minutes / hours. If I restart slapd on the failing consumers, then the syncrepl succeeds immediately. When the syncrepl is failing, I can still perform an ldapsearch using STARTTLS and the syncrepl binding credentials  from the consumer to the provider. There is no issue with non-TLS LDAP replications.

Also, using ldapsearch with option "-d 3" I get TLS debug, but even with "olcLogLevel -1" in cn=config I can't get precise debug for TLS during syncrepl.

I captured the trafic and when there is the error, wireshark at some point can't understand TLS sent by the
consumer (SSL "ignored unknown record"). Corresponding data is "30 05 02 01 02 42 00". Not sure it's related to the problem though, but it looks like since (if I am right) the first information in TLS is the content-type, and 30 is not a valid value for content type (https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml#tls-parameters-5).

Any idea ? I would like to avoid making the injection script connect to all my preprod consumers and restart slapd on them. Please feel free to ask for more logs / configurations.

Log on the provider :
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: slap_listener_activate(9):
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 busy
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: >>> slap_listener(ldap:///)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: listen=9, new connection on 18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: added 18r (active) listener=(nil)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 fd=18 ACCEPT from IP=192.168.115.61:33816 (IP=0.0.0.0:389)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:  18r
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010

May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: op tag 0x77, time 1496231248
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 do_extended
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 EXT oid=1.3.6.1.4.1.1466.20037
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: do_extended: oid=1.3.6.1.4.1.1466.20037
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 STARTTLS
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: send_ldap_extended: err=0 oid= len=0
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: send_ldap_response: msgid=1 tag=120 err=0
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 op=0 RESULT oid= err=0 text=
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:  18r
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:  18r
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): unable to get TLS client DN, error=49 id=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 fd=18 TLS established tls_ssf=128 ssf=128
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 2 descriptors
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:  18r
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: read active on 18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_get(18): got connid=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): checking for input on id=1010
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: ber_get_next on fd 18 failed errno=0 (Success)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_read(18): input error=-2 id=1010, closing.
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_closing: readying conn=1010 sd=18 for close
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: connection_close: conn=1010 sd=18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: removing 18
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: conn=1010 fd=18 closed (connection lost)
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]:
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=12 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-master-pp slapd[5828]: daemon: epoll: listen=13 active_threads=0 tvp=zero

Log on the consumer :
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: =>do_syncrepl rid=006
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=11 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: slap_client_connect: URI=ldap://ldap-master-pp.acme Error, ldap_start_tls failed (-11)
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: do_syncrepl: rid=006 rc -11 retrying
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: activity on 1 descriptor
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: activity on:
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]:
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=9 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=10 active_threads=0 tvp=zero
May 31 13:47:28 lxc-sd-ldap-replica-pp-prim slapd[32753]: daemon: epoll: listen=11 active_threads=0 tvp=zero


syncrepl config on consumer :
olcSyncrepl: {0}rid=6 provider=ldap://ldap-master-pp.acme starttls=critical t
 ls_reqcert=never bindmethod=simple binddn="cn=replication,ou=Applications,d
 c=acme,dc=fr" credentials=**** searchbase="dc=acme,dc=fr" schemacheck
 ing=off type=refreshAndPersist filter="(objectClass=*)" attrs="*" scope=sub
  retry="60 +"

TLS config on provider :

dn: cn=config
objectClass: olcGlobal
cn: config
olcArgsFile: /var/run/slapd/slapd.args
olcPidFile: /var/run/slapd/slapd.pid
olcToolThreads: 1
structuralObjectClass: olcGlobal
entryUUID: 9efabaca-b31f-1036-9746-2d4f8feb88f1
creatorsName: cn=config
createTimestamp: 20170411162820Z
olcTLSCertificateFile: /etc/ssl/certs/ssl-cert-snakeoil.pem
olcTLSCertificateKeyFile: /etc/ssl/private/ssl-cert-snakeoil.key
olcLogLevel: 256
entryCSN: 20170531131949.877388Z#000000#000#000000
modifiersName: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
modifyTimestamp: 20170531131949Z