https://bugs.openldap.org/show_bug.cgi?id=9533
Issue ID: 9533
Summary: OpenLdap hangs when creating many databases
Product: OpenLDAP
Version: 2.4.57
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs@openldap.org
Reporter: akrush24@gmail.com
Target Milestone: ---
I initialize an OpenLDAP cluster with the following config:
```
dn: cn=config
objectClass: olcGlobal
cn: config
olcPidFile: /run/openldap/slapd.pid
olcArgsFile: /run/openldap/slapd.args
olcServerID: 1 ldaps://ldap.ldap01.xxx.ru:637
olcServerID: 2 ldaps://ldap.ldap02.xxx.ru:637
olcServerID: 3 ldaps://ldap.ldap03.xxx.ru:637
olcTLSCACertificateFile: /etc/openldap/ssl/ca.pem
olcTLSCertificateKeyFile: /etc/openldap/ssl/private.key
olcTLSCertificateFile: /etc/openldap/ssl/server.crt

dn: cn=module,cn=config
objectClass: olcModuleList
cn: module
olcModulepath: /usr/lib/openldap
olcModuleload: back_mdb.so
olcModuleload: syncprov.so

dn: cn=schema,cn=config
objectClass: olcSchemaConfig
cn: schema

include: file:///etc/openldap/schema/core.ldif

include: file:///etc/openldap/schema/cosine.ldif

include: file:///etc/openldap/schema/inetorgperson.ldif

include: file:///etc/openldap/schema/nis.ldif

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcAccess: {0}to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth manage
olcRootPW: Ohtheis7ur9Qua6e
olcSyncRepl: rid=001 provider=ldaps://ldap.ldap01.xxx.ru:637 binddn=cn=config bindmethod=simple credentials=Ohtheis7ur9Qua6e searchbase=cn=config type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=002 provider=ldaps://ldap.ldap02.xxx.ru:637 binddn=cn=config bindmethod=simple credentials=Ohtheis7ur9Qua6e searchbase=cn=config type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=003 provider=ldaps://ldap.ldap03.xxx.ru:637 binddn=cn=config bindmethod=simple credentials=Ohtheis7ur9Qua6e searchbase=cn=config type=refreshAndPersist retry="5 5 300 5" timeout=1
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={0}config,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
```
Then I try to create many databases in a loop. My database template:
```
/etc/openldap/conf.d # cat > newdb.ldiff.template <<EOF!
dn: olcDatabase={#DITID#}mdb,cn=config
changetype: add
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {#DITID#}mdb
olcSuffix: dc=devmail,dc=srv,dc=local
olcDbMaxSize: 1073741824
olcRootDN: cn=admin,dc=devmail,dc=srv,dc=local
olcRootPW: 123
olcDbDirectory: /var/lib/openldap/openldap-data/
olcDbIndex: objectClass eq
olcSyncRepl: rid=001 provider=ldaps://ldap.ldap01.xxx.local:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=002 provider=ldaps://ldap.ldap02.xxx.local:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=003 provider=ldaps://ldap.ldap03.xxx.local:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={#DITID#}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
EOF!
```
For example, 100 databases:
```
for I in $(seq 1 100);do
  sed -e "s/devmail/devmail${I}/g" -e "s/#DITID#/${I}/g" ./newdb.ldiff.template > newdb${I}.ldiff
  ldapmodify -H ldapi://%2Fvar%2Frun%2Fopenldap%2Fldapi -Y EXTERNAL -f ./newdb${I}.ldiff
done
```
As a result, my cluster first slows down and then the nodes hang. The logs show nothing, no activity. Connecting via ldapmodify or other CLI utilities just hangs. In this situation, rebooting a single node of the cluster helps until the cluster becomes unresponsive again.
Please tell me what could be the problem? Could my cluster configuration be incorrect?
The problem manifests itself only when I have many databases, e.g. more than 10. With one or two databases everything works as expected. I have also tried different built-in database backends, to no avail.
Quanah Gibson-Mount quanah@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID
--- Comment #1 from Quanah Gibson-Mount quanah@openldap.org ---
(In reply to Andrey from comment #0)
> olcDbDirectory: /var/lib/openldap/openldap-data/
> Please tell me what could be the problem? Could my cluster configuration be incorrect?
They cannot use the same directory for the database, they must each have a unique path to their db.
--- Comment #2 from Andrey akrush24@gmail.com ---
(In reply to Quanah Gibson-Mount from comment #1)
> (In reply to Andrey from comment #0)
> > olcDbDirectory: /var/lib/openldap/openldap-data/
> > Please tell me what could be the problem? Could my cluster configuration be incorrect?
> They cannot use the same directory for the database, they must each have a unique path to their db.
I now create each DIT in a different directory; unfortunately, it didn't help. LDAP still freezes.
Example of my config template:
```
# cat > newdb.ldiff.template <<EOF!
dn: olcDatabase={#DITID#}mdb,cn=config
changetype: add
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {#DITID#}mdb
olcSuffix: dc=devmail,dc=srv,dc=local
olcDbMaxSize: 1073741824
olcRootDN: cn=admin,dc=devmail,dc=srv,dc=local
olcRootPW: 123
olcDbDirectory: /var/lib/openldap/openldap-data/#DITID#
olcDbIndex: objectClass eq
olcSyncRepl: rid=001 provider=ldaps://ldap.ldap01.xxx.ru:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=002 provider=ldaps://ldap.ldap02.xxx.ru:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=003 provider=ldaps://ldap.ldap03.xxx.ru:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={#DITID#}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
EOF!
```
--- Comment #3 from Ondřej Kuzník ondra@mistotebe.net ---
On Tue, Apr 27, 2021 at 07:32:01AM +0000, openldap-its@openldap.org wrote:
> I now create each DIT in a different directory; unfortunately, it didn't help. LDAP still freezes.
> dn: olcDatabase={#DITID#}mdb,cn=config
> [...]
> olcSyncRepl: rid=001
I would note that rids have to be unique within the same server.
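For readers following along, here is a minimal sketch of one way to derive unique rids per database. It assumes three providers per database and that rids 1-3 are already taken by cn=config, as in the configuration at the top of this ticket. The function name `rids_for_db` is hypothetical. slapd.conf(5) documents rid as a non-negative integer no greater than 999, so the sketch refuses to go past that; with three rids per database this numbering runs out at database 333.

```shell
#!/bin/sh
# Hypothetical helper: print the three zero-padded rids for database number N.
# Assumes rids 1-3 belong to cn=config and each database has three providers.
rids_for_db() {
    n=$1
    first=$(( 3 * n + 1 ))   # database 1 -> 4, database 2 -> 7, ...
    last=$(( first + 2 ))
    # rid is limited to three decimal digits (0..999)
    if [ "$last" -gt 999 ]; then
        echo "rid $last exceeds 999 for database $n" >&2
        return 1
    fi
    printf '%03d %03d %03d\n' "$first" "$(( first + 1 ))" "$last"
}
```

Calling `rids_for_db 1` prints `004 005 006`, which can be split with `set -- $(rids_for_db 1)` and substituted into a template.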
--- Comment #4 from Andrey akrush24@gmail.com ---
(In reply to Ondřej Kuzník from comment #3)
> On Tue, Apr 27, 2021 at 07:32:01AM +0000, openldap-its@openldap.org wrote:
> > I now create each DIT in a different directory; unfortunately, it didn't help. LDAP still freezes.
> > dn: olcDatabase={#DITID#}mdb,cn=config
> > [...]
> > olcSyncRepl: rid=001
> I would note that rids have to be unique within the same server.
Thanks, I made the RIDs unique across all DITs, but it didn't help.
```
# cat > newdb.ldiff.template <<EOF!
dn: olcDatabase={#DITID#}mdb,cn=config
changetype: add
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {#DITID#}mdb
olcSuffix: dc=devmail,dc=srv,dc=local
olcDbMaxSize: 1073741824
olcRootDN: cn=admin,dc=devmail,dc=srv,dc=local
olcRootPW: 123
olcDbDirectory: /var/lib/openldap/openldap-data/#DITID#
olcDbIndex: objectClass eq
olcSyncRepl: rid=#RIDID1# provider=ldaps://ldap.ldap01.xxx.ru:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=#RIDID2# provider=ldaps://ldap.ldap02.xxx.ru:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=#RIDID3# provider=ldaps://ldap.ldap03.xxx.ru:637 binddn=cn=admin,dc=devmail,dc=srv,dc=local bindmethod=simple credentials=123 searchbase=dc=devmail,dc=srv,dc=local type=refreshAndPersist retry="5 5 300 5" timeout=1
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={#DITID#}mdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
EOF!
```
and apply it in a loop (of course /var/lib/openldap/openldap-data/1..40 exist)
```
RIDID=3
for I in $(seq 1 40);do
  RIDID=$((RIDID+1))
  RIDID1="$(printf %03d ${RIDID})"
  RIDID=$((RIDID+1))
  RIDID2=$(printf %03d ${RIDID})
  RIDID=$((RIDID+1))
  RIDID3=$(printf %03d ${RIDID})
  sed -e "s/devmail/devmail${I}/g" -e "s/#RIDID1#/${RIDID1}/g" -e "s/#RIDID2#/${RIDID2}/g" -e "s/#RIDID3#/${RIDID3}/g" -e "s/#DITID#/${I}/g" ./newdb.ldiff.template > newdb${I}.ldiff
  ldapmodify -H ldapi://%2Fvar%2Frun%2Fopenldap%2Fldapi -Y EXTERNAL -f ./newdb${I}.ldiff
done
```
--- Comment #5 from Howard Chu hyc@openldap.org ---
(In reply to Andrey from comment #4)
> (In reply to Ondřej Kuzník from comment #3)
> > On Tue, Apr 27, 2021 at 07:32:01AM +0000, openldap-its@openldap.org wrote:
> > > I now create each DIT in a different directory; unfortunately, it didn't help. LDAP still freezes.
> > > dn: olcDatabase={#DITID#}mdb,cn=config
> > > [...]
> > > olcSyncRepl: rid=001
> > I would note that rids have to be unique within the same server.
> Thanks, I made the RIDs unique across all DITs, but it didn't help.
Suffix must also be unique.
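The three uniqueness requirements raised so far (database directory, rid, suffix) can be checked on the generated LDIF before it is fed to ldapmodify. What follows is only a sketch: the function name `find_duplicates` is hypothetical, and it assumes each attribute value sits on a single unfolded line, contains no spaces in the suffix or directory, and uses the attribute-name spelling shown in this ticket.

```shell
#!/bin/sh
# Hypothetical checker: read LDIF text on stdin and print every olcSuffix,
# olcDbDirectory, or syncrepl rid value that occurs more than once.
find_duplicates() {
    awk '
        /^olcSuffix:/      { seen["suffix " $2]++ }
        /^olcDbDirectory:/ { seen["directory " $2]++ }
        /^olcSyncRepl:/    {
            # rid=001 and rid=1 are the same rid, so compare numerically
            for (i = 2; i <= NF; i++)
                if ($i ~ /^rid=/) seen["rid " (substr($i, 5) + 0)]++
        }
        END { for (k in seen) if (seen[k] > 1) print k }
    ' | sort
}
```

For example, `cat newdb*.ldiff | find_duplicates` prints nothing when every value is unique, and lines such as `directory /var/lib/openldap/openldap-data/` or `rid 1` when something is shared between databases.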
--- Comment #6 from Howard Chu hyc@openldap.org ---
(In reply to Howard Chu from comment #5)
> (In reply to Andrey from comment #4)
> > (In reply to Ondřej Kuzník from comment #3)
> > > On Tue, Apr 27, 2021 at 07:32:01AM +0000, openldap-its@openldap.org wrote:
> > > > I now create each DIT in a different directory; unfortunately, it didn't help. LDAP still freezes.
> > > > dn: olcDatabase={#DITID#}mdb,cn=config
> > > > [...]
> > > > olcSyncRepl: rid=001
> > > I would note that rids have to be unique within the same server.
> > Thanks, I made the RIDs unique across all DITs, but it didn't help.
> Suffix must also be unique.
Unable to reproduce any hang.
```
PROVIDER1=ldaps://ldap.ldap01.xxx.ru:637
PROVIDER2=ldaps://ldap.ldap02.xxx.ru:637
PROVIDER3=ldaps://ldap.ldap03.xxx.ru:637
RID=3
for I in $(seq 1 40); do
  RID=`expr $RID + 1`
  RIDID1=$RID
  RID=`expr $RID + 1`
  RIDID2=$RID
  RID=`expr $RID + 1`
  RIDID3=$RID
  DBDIR=testrun/db.$I
  mkdir $DBDIR
  SUFFIX="dc=devmail$I,dc=srv,dc=local"
  ldapmodify -a -H ldapi://%2Fvar%2Frun%2Fopenldap%2Fldapi -Y EXTERNAL <<EOF
dn: olcDatabase={$I}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {$I}mdb
olcSuffix: $SUFFIX
olcDbMaxSize: 1073741824
olcRootDN: cn=admin,$SUFFIX
olcRootPW: 123
olcDbDirectory: $DBDIR
olcDbIndex: objectClass eq
olcSyncRepl: rid=$RIDID1 provider=$PROVIDER1 binddn=cn=admin,$SUFFIX bindmethod=simple credentials=123 searchbase=$SUFFIX type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=$RIDID2 provider=$PROVIDER2 binddn=cn=admin,$SUFFIX bindmethod=simple credentials=123 searchbase=$SUFFIX type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=$RIDID3 provider=$PROVIDER3 binddn=cn=admin,$SUFFIX bindmethod=simple credentials=123 searchbase=$SUFFIX type=refreshAndPersist retry="5 5 300 5" timeout=1
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={$I}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
EOF
done
```
Probably your $PROVIDER servers are just being slow to respond. Increase the number of slapd threads so that some workers are available while the consumers are busy trying to contact the providers.
This ticket remains INVALID. Software usage questions should have been directed to the openldap-technical mailing list.
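The suggestion above to increase the number of slapd threads maps to the `olcThreads` attribute on `cn=config`. A minimal sketch of generating that change follows; the function name `threads_ldif` and the value 32 are illustrative assumptions, not values recommended anywhere in this ticket.

```shell
#!/bin/sh
# Hypothetical helper: print an LDIF modify request that sets olcThreads to N.
threads_ldif() {
    cat <<EOF
dn: cn=config
changetype: modify
replace: olcThreads
olcThreads: $1
EOF
}
```

It could be applied over the same ldapi socket used earlier in the ticket, e.g. `threads_ldif 32 | ldapmodify -H ldapi://%2Fvar%2Frun%2Fopenldap%2Fldapi -Y EXTERNAL`. A suitable value depends on how many consumers can be blocked on slow providers at once.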
Quanah Gibson-Mount quanah@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED
--- Comment #7 from Andrey akrush24@gmail.com ---
> Probably your $PROVIDER servers are just being slow to respond. Increase the number of slapd threads so that some workers are available while the consumers are busy trying to contact the providers.
> This ticket remains INVALID. Software usage questions should have been directed to the openldap-technical mailing list.
I can reproduce the LDAP hangs. Can you try running my example, https://github.com/akrush24/openldap-replication?
```
git clone https://github.com/akrush24/openldap-replication # it's my own example
cd openldap-replication
./up.sh
time docker exec -ti ldap01 ./addnewdb.sh # attempt to create 333 dbs
./check.sh # check the db count on every ldap server
```
--- Comment #8 from Howard Chu hyc@openldap.org ---
(In reply to Andrey from comment #7)
> > Probably your $PROVIDER servers are just being slow to respond. Increase the number of slapd threads so that some workers are available while the consumers are busy trying to contact the providers.
> > This ticket remains INVALID. Software usage questions should have been directed to the openldap-technical mailing list.
> I can reproduce the LDAP hangs. Can you try running my example, https://github.com/akrush24/openldap-replication?
This ticket remains INVALID. Software usage questions should have been directed to the openldap-technical mailing list.