On Friday 27 June 2008 19:26:22 Liutauras Adomaitis wrote:
hello everybody,
I'm quite new to OpenLDAP. Actually, I've been using it for a few years, but
I have no deep knowledge.
The problem I'm facing is that my consumer replicas are segfaulting.
There were a number of fixes to syncrepl in 2.4.9.
My design:
I have one master holding several branches o=BranchX,dc=example,dc=com; this
is the provider. I have several (X-1 of them) replicas, the consumers.
Each consumer replicates its own branch o=BranchX,dc=example,dc=com and
one common branch o=BranchMain,dc=example,dc=com.
The picture is like this:
Provider
o=BranchMain,dc=example,dc=com
o=Branch1,dc=example,dc=com
o=Branch2,dc=example,dc=com
.....
o=BranchX,dc=example,dc=com
Consumer 1:
o=BranchMain,dc=example,dc=com
o=Branch1,dc=example,dc=com
Consumer 2:
o=BranchMain,dc=example,dc=com
o=Branch2,dc=example,dc=com
But it seems you have implemented this by using a single database at
dc=example,dc=com, with multiple syncrepl statements (one for each subtree
that you replicate). As far as I know, this is not supported. Instead, you
should consider using a separate database for each syncrepl statement, and
glue the databases together by using the 'subordinate' statement in each
sub-tree database.
This would look something like this:
# one subordinate database per replicated subtree
database bdb
suffix o=BranchMain,dc=example,dc=com
subordinate
syncrepl ...
[...]
database bdb
suffix o=Branch1,dc=example,dc=com
subordinate
syncrepl ...
[...]
# the glue (superior) database is defined after its subordinates
database bdb
suffix dc=example,dc=com
syncrepl ...
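As a concrete sketch for Consumer 1, reusing the rids, provider URL and
credentials from your posted consumer config (untested, so treat it as an
illustration rather than a drop-in config; the per-database directory paths
such as /var/lib/ldap/branchmain are my invention, but note that each
database does need its own directory):

database bdb
suffix "o=BranchMain,dc=example,dc=com"
directory /var/lib/ldap/branchmain
subordinate
syncrepl rid=1
        provider=ldap://master.server:389
        type=refreshAndPersist
        retry="60 +"
        searchbase="o=BranchMain,dc=example,dc=com"
        bindmethod=simple
        binddn="cn=Manager,dc=example,dc=com"
        credentials=secret
        starttls=yes

database bdb
suffix "o=Branch1,dc=example,dc=com"
directory /var/lib/ldap/branch1
subordinate
syncrepl rid=2
        provider=ldap://master.server:389
        type=refreshAndPersist
        retry="60 +"
        searchbase="o=Branch1,dc=example,dc=com"
        bindmethod=simple
        binddn="cn=Manager,dc=example,dc=com"
        credentials=secret
        starttls=yes

# glue database holding the common suffix, defined last
database bdb
suffix "dc=example,dc=com"
rootdn "cn=Manager,dc=example,dc=com"
directory /var/lib/ldap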
At the beginning I had one consumer, which was segfaulting just randomly,
once or twice a day. I decided to comment out the syncrepl directives in my
conf file, and now it has been running for a day and a half.
I'm running 2.4.9 or 2.4.10 on my own systems at present, but I didn't see any
problems like this on small databases with 2.4.8.
I should mention that after a
consumer segfaults I cannot start slapd any more. The only solution I have
is to delete all of the /var/lib/ldap (database directory) contents and then
restart slapd.
Did running database recovery (/etc/init.d/ldap recover) make any difference
here?
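If the init script has no recover target, the equivalent can be done by hand
with the Berkeley DB utilities; a minimal sketch, assuming /var/lib/ldap is
the database directory (the binary may be named db4.4_recover or similar,
depending on how BDB is packaged):

# stop slapd first, then run recovery against the BDB environment
/etc/init.d/ldap stop
db_recover -v -h /var/lib/ldap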
If I restart slapd on the old database, the segfault happens again.
Since this was a small branch, and only one branch, I thought I would debug
the problem later. Today I faced the same situation on a bigger consumer:
slapd just crashed, and only deleting the database let me start it again.
For a larger database, you should have some database cache specified in the
DB_CONFIG file in the database directory.
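For example, a minimal DB_CONFIG sketch (the 256 MB cache size is an
assumption; size it so the working set of data and index files fits):

# BDB cache: <gbytes> <bytes> <number of cache segments>
set_cachesize 0 268435456 1
# larger log buffer and log region to reduce log I/O
set_lg_bsize 2097152
set_lg_regionmax 262144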
My systems are Mandriva 2008.1 with slapd version:
@(#) $OpenLDAP: slapd 2.4.8 (Mar 23 2008 16:49:39) $
mandrake@klodia.mandriva.com:
/home/mandrake/rpm/BUILD/openldap-2.4.8/servers/slapd
I am considering shipping an official update, most likely to 2.4.10. In the
meantime, I have released 2.4.10 to backports for 2008.1. If fixing your
configuration doesn't address all your stability problems, you may want to
consider upgrading to that package.
I have one branch running old slapd versions (the ones coming with Mandriva
2007.0), but they seem to work, except that I can replicate only one branch
(one rid).
See above; the multiple-database approach (one syncrepl statement per
database) would work in 2.3 as well.
It seems the old slapd doesn't support several rids.
And the new slapd mainly supports multiple syncrepl statements in the same
database for multi-master replication, not for the design you've chosen.
Can anybody help me debug this situation? This configuration is rather new,
but I was planning to build the whole infrastructure on such a configuration,
so the segfaulting is a very big issue.
Provider (master) configuration is:
include /usr/share/openldap/schema/core.schema
include /usr/share/openldap/schema/cosine.schema
include /usr/share/openldap/schema/corba.schema
include /usr/share/openldap/schema/inetorgperson.schema
include /usr/share/openldap/schema/nis.schema
include /usr/share/openldap/schema/openldap.schema
include /usr/share/openldap/schema/samba.schema
include /usr/share/openldap/schema/qmail.schema
include /etc/openldap/schema/local.schema
include /etc/openldap/slapd.access.conf
access to dn.subtree="dc=example,dc=com"
by group="cn=Replicator,ou=Group,dc=example,dc=com"
by users read
by anonymous read
pidfile /var/run/ldap/slapd.pid
argsfile /var/run/ldap/slapd.args
modulepath /usr/lib64/openldap
moduleload syncprov.la
TLSRandFile /dev/random
TLSCipherSuite HIGH:MEDIUM:+SSLv2+SSLv3
TLSCertificateFile /etc/pki/tls/certs/slapd.pem
TLSCertificateKeyFile /etc/pki/tls/certs/slapd.pem
TLSCACertificatePath /etc/pki/tls/certs/
TLSCACertificateFile /etc/pki/tls/certs/ca-bundle.crt
# TLSVerifyClient: never | allow | try | demand
TLSVerifyClient never
database bdb
suffix "dc=example,dc=com"
rootdn "cn=Manager,dc=example,dc=com"
rootpw secret
directory /var/lib/ldap
checkpoint 256 5
index mailAlternateAddress eq,sub
index accountStatus,mailHost,deliveryMode eq
index default sub
index objectClass eq
index cn,mail,surname,givenname eq,subinitial
index uidNumber,gidNumber,memberuid,member,uniqueMember eq
index uid eq,subinitial
index sambaSID,sambaDomainName,displayName eq
index entryCSN,entryUUID eq
limits group="cn=Replicator,dc=example,dc=com" size=unlimited time=unlimited
access to *
        by group="cn=Replicator,dc=example,dc=com" write
        by * read
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 10
Consumer configuration (the same on all consumers):
include /usr/share/openldap/schema/core.schema
include /usr/share/openldap/schema/cosine.schema
include /usr/share/openldap/schema/corba.schema
include /usr/share/openldap/schema/inetorgperson.schema
include /usr/share/openldap/schema/nis.schema
include /usr/share/openldap/schema/openldap.schema
include /usr/share/openldap/schema/samba.schema
include /usr/share/openldap/schema/qmail.schema
include /etc/openldap/schema/local.schema
include /etc/openldap/slapd.access.conf
include /etc/openldap/slapd.access.ldapauth.conf
access to dn.subtree="dc=example,dc=com"
by group="cn=Replicator,ou=Group,dc=example,dc=com"
by users read
by anonymous read
pidfile /var/run/ldap/slapd.pid
argsfile /var/run/ldap/slapd.args
modulepath /usr/lib64/openldap
moduleload back_ldap.la
TLSCertificateFile /etc/ssl/openldap/ldap.pem
TLSCertificateKeyFile /etc/ssl/openldap/ldap.pem
TLSCACertificateFile /etc/ssl/openldap/ldap.pem
overlay chain
chain-uri "ldap://master.server"
chain-idassert-bind bindmethod="simple"
binddn="cn=Manager,dc=example,dc=com"
credentials=secret
mode="none"
chain-tls start
chain-return-error TRUE
database bdb
suffix "dc=example,dc=com"
rootdn "cn=Manager,dc=example,dc=com"
rootpw secret
directory /var/lib/ldap
checkpoint 256 5
index objectClass eq
index mailAlternateAddress eq,sub
index accountStatus,mailHost,deliveryMode eq
index default sub
index cn,mail,surname,givenname eq,subinitial
index uidNumber,gidNumber,memberuid,member,uniqueMember eq
index uid eq,subinitial
index sambaSID,sambaDomainName,displayName eq
limits group="cn=Replicator,ou=Group,dc=example,dc=com" size=unlimited time=unlimited
syncrepl rid=1
provider=ldap://master.server:389
type=refreshAndPersist
retry="60 +"
searchbase="o=BranchMain,dc=example,dc=com"
filter="(objectClass=*)"
scope=sub
attrs=*
Using attrs=* will mean you don't replicate operational attributes; either
leave attrs unspecified, or use the default ('attrs=*,+').
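In this config, that means replacing the attrs=* line in each syncrepl
stanza with, for example:

        attrs="*,+"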
schemachecking=off
bindmethod=simple
binddn="cn=Manager,dc=example,dc=com"
credentials=secret
starttls=yes
syncrepl rid=2
provider=ldap://master.server:389
type=refreshAndPersist
retry="60 +"
searchbase="o=Branch1,dc=example,dc=com"
filter="(objectClass=*)"
scope=sub
attrs=*
schemachecking=off
bindmethod=simple
binddn="cn=Manager,dc=example,dc=com"
credentials=secret
starttls=yes
updateref ldap://master.server
Regards,
Buchan