syncrepl replication on 2.4.19 (stable)
by Brett @Google
Hello,
I am having a very odd problem after upgrading from OpenLDAP 2.4.16 (stable).
I have a syncrepl provider/consumer setup using OpenLDAP 2.4.19 (stable),
and when I start an empty consumer, I get the following in the provider logs:
Nov 4 17:07:51 producer slapd[7250]: [ID 702911 local4.debug] @(#)
$OpenLDAP: slapd 2.4.19 (Nov 4 2009 12:53:47) $
Nov 4 17:07:51 producer
@qgdevpro:/home/govops/build.local/openldap-2.4.19/servers/slapd
Nov 4 17:07:51 producer slapd[7286]: [ID 100111 local4.debug] slapd
starting
Nov 4 17:08:04 producer slapd[7286]: [ID 848112 local4.debug] conn=0 fd=16
ACCEPT from IP=10.0.0.2:53951 (IP=10.0.0.1:389)
Nov 4 17:08:04 producer slapd[7286]: [ID 215403 local4.debug] conn=0 op=0
BIND dn="cn=replicator,dc=example,dc=org" method=128
Nov 4 17:08:04 producer slapd[7286]: [ID 600343 local4.debug] conn=0 op=0
BIND dn="cn=replicator,dc=example,dc=org" mech=SIMPLE ssf=0
Nov 4 17:08:04 producer slapd[7286]: [ID 588225 local4.debug] conn=0 op=0
RESULT tag=97 err=0 text=
Nov 4 17:08:04 producer slapd[7286]: [ID 469902 local4.debug] conn=0 op=1
SRCH base="dc=example,dc=org" scope=2 deref=0 filter="(objectClass=*)"
Nov 4 17:08:04 producer slapd[7286]: [ID 744844 local4.debug] conn=0 op=1
SRCH attr=* +
Nov 4 17:08:04 producer slapd[7286]: [ID 832699 local4.debug] conn=0 op=1
SEARCH RESULT tag=101 err=0 nentries=0 text=
Nov 4 17:08:04 producer slapd[7286]: [ID 218904 local4.debug] conn=0 op=2
UNBIND
Nov 4 17:08:04 producer slapd[7286]: [ID 952275 local4.debug] conn=0 fd=16
closed
On the consumer I get a lot of these (one set after each refresh attempt):
Nov 4 17:41:04 consumer slapd[7660]: [ID 365351 local4.debug] do_syncrep2:
rid=001 LDAP_RES_SEARCH_RESULT
Nov 4 17:41:04 consumer slapd[7660]: [ID 664938 local4.debug] do_syncrepl:
rid=001 rc -2 retrying
The important part is "nentries=0". When I run the equivalent command at the
command prompt of the consumer, i.e.:
ldapsearch -b dc=example,dc=org -D 'cn=replicator,dc=example,dc=org' -w
<password> -s sub -x '(objectclass=*) ' '* +'
I get the result I would expect above, i.e.:
Nov 4 17:20:14 producer slapd[7286]: [ID 848112 local4.debug] conn=16 fd=16
ACCEPT from IP=10.0.0.2:54049 (IP=10.0.0.1:389)
Nov 4 17:20:14 producer slapd[7286]: [ID 215403 local4.debug] conn=16 op=0
BIND dn="cn=replicator,dc=example,dc=org" method=128
Nov 4 17:20:14 producer slapd[7286]: [ID 600343 local4.debug] conn=16 op=0
BIND dn="cn=replicator,dc=example,dc=org" mech=SIMPLE ssf=0
Nov 4 17:20:14 producer slapd[7286]: [ID 588225 local4.debug] conn=16 op=0
RESULT tag=97 err=0 text=
Nov 4 17:20:14 producer slapd[7286]: [ID 469902 local4.debug] conn=16 op=1
SRCH base="dc=example,dc=org" scope=2 deref=0 filter="(objectClass=*)"
Nov 4 17:20:14 producer slapd[7286]: [ID 744844 local4.debug] conn=16 op=1
SRCH attr=* +
Nov 4 17:21:03 producer slapd[7286]: [ID 832699 local4.debug] conn=16 op=1
SEARCH RESULT tag=101 err=0 nentries=85611 text=
Nov 4 17:21:03 producer slapd[7286]: [ID 218904 local4.debug] conn=16 op=2
UNBIND
Nov 4 17:21:03 producer slapd[7286]: [ID 952275 local4.debug] conn=16 fd=16
closed
Note that here I get nentries=85611 (with a whole bunch of results) for what is
essentially the same query.
I'd appreciate any feedback; surely I must be missing something really
obvious?
My config is below.
Cheers
Brett
<< begin of provider slapd >>
######################################################################
# global options
######################################################################
include /usr/local/openldap/etc/openldap/schema/core.schema
include /usr/local/openldap/etc/openldap/schema/cosine.schema
include /usr/local/openldap/etc/openldap/schema/inetorgperson.schema
modulepath /usr/local/openldap/libexec/openldap
#moduleload back_ldbm.la
#moduleload back_monitor.la
pidfile /var/openldap/run/slapd.pid
argsfile /var/openldap/run/slapd.args
# threads for faster concurrent slapadd
tool-threads 4
######################################################################
# global database ACLs
######################################################################
# allow replicator to read all
access to *
    by dn.exact="cn=replicator,dc=example,dc=org" read
    by * break
[ ..etc.. ]
# default rules
access to *
    by self write
    by * read
######################################################################
# logging configuration
######################################################################
# testing
loglevel stats sync
######################################################################
# primary database
######################################################################
database hdb
suffix "dc=example,dc=org"
directory /var/openldap/data
rootdn "cn=Manager, dc=example,dc=org"
rootpw <password>
checkpoint 2000 15
cachesize 20000
idlcachesize 60000
cachefree 4000
# unlimited dn cache (openldap 2.4.16 and above)
dncachesize 0
# General Indexes (there is more than this - but they are all the same form)
index default pres,eq
index objectClass,uid,mail pres,eq
index cn,sn,ou,streetAddress,givenName,title,telephoneNumber eq,sub
# Indices for Syncrepl
index entryCSN,entryUUID eq
# allow replicator DN have unlimited searches (per-database)
limits dn.exact="cn=replicator,dc=example,dc=org" time=unlimited
    size=unlimited
######################################################################
# replication information - monitor backend
######################################################################
database monitor
<< end of provider slapd >>
<< the snippet below is added to the above on the consumer only, just before
"database monitor" but after the rest of the config >>
######################################################################
# replication information - only for consumer
######################################################################
# Where we pull data from
syncrepl rid=001
    provider=ldap://provider.example.org:389
    bindmethod=simple
    binddn="cn=replicator,dc=example,dc=org"
    credentials=<password>
    searchbase="dc=example,dc=org"
    filter=(objectclass=*)
    attrs="*,+"
    schemachecking=off
    scope=sub
    type=refreshAndPersist
    retry="60 +"
# not using accesslog atm - debugging initial refresh
# logbase="cn=accesslog"
# logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
# syncdata=accesslog
# Refer all rights to master
updateref ldap://provider.example.org:389
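One difference between the two searches above is that a syncrepl consumer attaches the LDAP Sync request control (RFC 4533) to its search, while the plain ldapsearch does not. A minimal sketch of a closer equivalent, assuming an ldapsearch build that supports the `-E sync=` search extension. The function name `sync_probe` is hypothetical, and `-W` prompts for the password instead of passing it on the command line:

```shell
# Hypothetical helper: repeat the consumer's search with the LDAP Sync
# control attached. "!" marks the control as critical; "ro" requests a
# refreshOnly sync without presenting a cookie.
sync_probe() {
    ldapsearch -x -LLL \
        -H "ldap://provider.example.org:389" \
        -D "cn=replicator,dc=example,dc=org" -W \
        -b "dc=example,dc=org" -s sub \
        -E '!sync=ro' \
        '(objectClass=*)' '*' '+'
}
```

Running `sync_probe` from the consumer host and comparing the provider's log for that connection with the log of the syncrepl connection may show whether the control is what changes the result.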
13 years, 6 months
Troubleshooting synchronization
by Torsten Schlabach (Tascel eG)
Hi all!
I am currently trying to chase some problems in an n-way multi-master
setup with three servers. We have used the instructions at
http://www.openldap.org/doc/admin24/replication.html#N-Way%20Multi-Master
as our guidance and we are using OpenLDAP version 2.4.11.
The result we see currently is that replication works only partially,
with some strange errors here and there.
As I believe it will be pointless to post all our cn=config LDIF here
and explain scenarios which work and those which don't, I thought it
would be more productive to double-check that I have correctly
understood what I *should* be seeing on my systems and how I can
properly monitor this. My problem may be that I still need to learn how
to properly monitor my slapd.
To begin with, I would just ask for confirmation of my proper
understanding of the documentation:
1. A master server is a server which is using the syncprov overlay
(servers/slapd/overlays/syncprov.c). This overlay will do little more
than just provide a synchronization cookie (CSN) which consumers may ask
for to find out what needs to get replicated and what not.
2. A consumer server is a server in which an additional thread is
running which queries the master(s) at a given interval to ask for
updates and, if there are any, fetches them over the wire into the local
copy of the database. This synchronization thread is
servers/slapd/syncrepl.c, I guess?
3. An N-Way Multi-Master setup is a setup in which each of the N servers
is a master and also a consumer of all the other masters?
Am I right up to here?
So what I fail to understand is:
1. What is the difference between Mirror Mode and N-Way Multi-Master?
Especially given that in N-Way Multi-Master, I have to set olcMirrorMode
to TRUE.
2. Given that I have added a 'Sync' value to the olcLogLevel attribute,
what "health check" information should I be watching for in the log to
see that replication is attempted as expected?
3. What problems should I be watching for in the logs?
4. Could I, for example, manually ask a master (using some ldapsearch
statement, pretending to be the consumer) which entries the master
thinks I would have to update?
Regards,
Torsten
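As one possible starting point for question 3, here is a rough sketch that counts syncrepl retry messages per replication ID in a syslog stream. The message shape (`do_syncrepl: rid=001 ... retrying`) is taken from log excerpts posted by other users on this list, not from a documented log format, and the function name is hypothetical:

```shell
# Hypothetical helper: read syslog lines on stdin and report how many
# "retrying" messages each syncrepl rid produced. Accepts both the
# "rid=001" and "rid 008" spellings seen in posted logs.
count_sync_retries() {
    sed -n -E 's/.*do_syncrepl: rid[= ]([0-9]+).*retrying.*/\1/p' |
        sort | uniq -c | awk '{ print "rid=" $2, "retries=" $1 }'
}
```

It could be used as `count_sync_retries < /var/log/messages`, where the log path is an assumption that depends on the local syslog setup. A steadily growing count for one rid would suggest that this consumer keeps failing to complete its refresh.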
13 years, 6 months
Anonymous Syncrepl?
by Eric B.
Hi,
I'm relatively new to OpenLDAP and am trying to set up a slave server. I
figured the easiest way would be to use the anonymous user to perform the
synchronization given that my master allows for full anonymous reads:
access to *
    by self write
    by users read
    by anonymous read
I have tried to specify the following in my slave slapd.conf:
syncrepl rid=8
    provider=ldap://snoopy.domain.com:389
    type=refreshAndPersist
    retry="60 +"
    searchbase="dc=domain,dc=com"
    schemachecking=off
    bindmethod=simple
However, my slave seems to be unable to connect properly to the master. It
seems to be trying to write something, and I am not quite sure what. My
master has the following log:
Nov 9 16:37:52 snoopy slapd[1481]: conn=6270 fd=72 ACCEPT from
IP=10.1.1.8:39558 (IP=0.0.0.0:389)
Nov 9 16:37:52 snoopy slapd[1481]: conn=6270 op=0 BIND dn="" method=128
Nov 9 16:37:52 snoopy slapd[1481]: conn=6270 op=0 RESULT tag=97 err=0 text=
Nov 9 16:37:52 snoopy slapd[1481]: conn=6270 op=1 SRCH
base="dc=domain,dc=com" scope=2 deref=0 filter="(objectClass=*)"
Nov 9 16:37:52 snoopy slapd[1481]: conn=6270 op=1 SRCH attr=* +
Nov 9 16:37:52 snoopy slapd[1481]: send_search_entry: conn 6270 ber write
failed.
Nov 9 16:37:52 snoopy slapd[1481]: conn=6270 fd=72 closed (connection lost
on write)
My slave logs display the following:
Nov 9 16:45:36 spike slapd[32415]: do_syncrep2: rid 008got search entry
without control
Nov 9 16:45:36 spike slapd[32415]: do_syncrepl: rid 008 retrying
I thought it might have something to do with the type (in that
refreshAndPersist may require some form of write privileges), so I switched
to type refreshOnly, however, it made no difference. My log outputs remain
the same.
Can anyone steer me in the correct direction?
Thanks,
Eric
13 years, 6 months
hdb_search: does not match filter
by Antonini Gabriele
Hello,
I'm running OpenLDAP 2.3.43 on CentOS.
I'm trying to set up a working master-slave sync system.
I populate the master server using slapadd -q -l from an LDIF file of ~160k entries. Every time I (re)start the master server it takes a long time to become available (about 20 minutes), and in the logs I see these entries:
Nov 10 13:56:17 ldap01 slapd[2191]: hdb_search: 1 does not match filter
Nov 10 13:56:17 ldap01 slapd[2191]: entry_decode: ""
Nov 10 13:56:17 ldap01 slapd[2191]: <= entry_decode()
Nov 10 13:56:17 ldap01 slapd[2191]: hdb_search: 2 does not match filter
Nov 10 13:56:17 ldap01 slapd[2191]: entry_decode: ""
Nov 10 13:56:17 ldap01 slapd[2191]: <= entry_decode()
Nov 10 13:56:17 ldap01 slapd[2191]: hdb_search: 3 does not match filter
Nov 10 13:56:17 ldap01 slapd[2191]: entry_decode: ""
...
...
...
Nov 10 14:17:58 ldap01 slapd[32246]: hdb_search: 162959 does not match filter
Nov 10 14:17:58 ldap01 slapd[32246]: entry_decode: ""
Nov 10 14:17:58 ldap01 slapd[32246]: <= entry_decode()
Nov 10 14:17:58 ldap01 slapd[32246]: send_ldap_result: conn=-1 op=0 p=0
Nov 10 14:17:58 ldap01 slapd[32246]: slapd starting
If I comment out the syncprov section:
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100
in slapd.conf the server startup is very quick (2-3 seconds).
Is there something wrong in what I'm doing? Is this very long startup normal?
Thanks,
G.
13 years, 6 months
Bug when converting syncprov-checkpoint to olcSpCheckpoint?
by Kyle Blaney
I am experiencing unexpected results when converting the
syncprov-checkpoint option in file-based configuration to the
olcSpCheckpoint attribute in online configuration.
To reproduce the unexpected results:
1. Configure OpenLDAP using file-based configuration that contains the
following line:
syncprov-checkpoint 100 10
2. Convert file-based configuration to online configuration using the
following command:
slapd -f slapd.conf -F slapd.d
My file-based configuration is converted to the cn=config database, but
after conversion the value of the olcSpCheckpoint attribute in the
olcOverlay={1}syncprov,olcDatabase={1}bdb,cn=config entry is "100 600"
when I expect it to be the same as in my file-based configuration ("100
10").
Is this a bug or does the olcSpCheckpoint attribute use different units
(seconds) than the syncprov-checkpoint option (minutes)?
Kyle Blaney
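The two values are at least consistent with a minutes-to-seconds conversion, since 10 minutes is 600 seconds. The arithmetic as a small sketch; whether olcSpCheckpoint is meant to be read in seconds is not established here, and the helper name is hypothetical:

```shell
# Hypothetical helper: convert a checkpoint interval in minutes to seconds.
minutes_to_seconds() {
    echo $(( $1 * 60 ))
}

# The "10" from "syncprov-checkpoint 100 10"; prints 600.
minutes_to_seconds 10
```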
13 years, 6 months
Syncrepl: Two step forward, one step back
by Peter Mogensen
Hi,
I still can't figure out what to think of this:
I have a mirrormode setup. One server has the database, the other starts
empty. The data is slowly propagating from server-1 to server-2, but
when I monitor the process by counting how many objects are created on
server-2 below the root object, it is not strictly increasing.
All objects below the root have DNs of the form o=...
I see output like this:
# slapcat | grep 'dn: o=' | wc -l
601
# slapcat | grep 'dn: o=' | wc -l
622
# slapcat | grep 'dn: o=' | wc -l
620
# slapcat | grep 'dn: o=' | wc -l
628
Why does server-2 "regret" already replicated objects?
Is this expected behaviour?
/Peter
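To make the fluctuation easier to follow over time, the manual counting above could be wrapped in a small loop. This is only a sketch: the function names and the 10-second interval are arbitrary choices, and the pattern is anchored at the start of the line so that only `dn:` lines are counted.

```shell
# Count entries whose DN starts with "o=" in LDIF read from stdin.
# grep -c replaces the "grep | wc -l" pair.
count_o_entries() {
    grep -c '^dn: o='
}

# Hypothetical monitoring loop: print a timestamped count every 10 seconds
# until interrupted.
watch_o_entries() {
    while true; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$(slapcat | count_o_entries)"
        sleep 10
    done
}
```

DNs that slapcat emits base64-encoded (`dn:: ...`) are not matched by this pattern, so the count is only exact when all DNs are plain ASCII.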
13 years, 6 months
2.4.19 (stable) - sync replication issue
by Ken Ko
Hello
I have two OpenLDAP servers (ServerA & ServerB) and I recently upgraded to 2.4.19.
Now when I add a record to ServerA, the new record appears on both ServerA & ServerB; then, 20 minutes later, the record is deleted on both servers.
Here is the log that I found:
Nov 4 23:33:48 srr200-001 slapd[28100]: nonpresent_callback: rid=002 nonpresent UUID e73bcd9c-5da0-102e-96f1-8d1c45da731c, dn uid=amyzjkang,ou=users,dc=ygmt,dc=com
Nov 4 23:33:48 srr200-001 slapd[28100]: syncrepl_del_nonpresent: rid=002 be_delete uid=amyzjkang,ou=users,dc=ygmt,dc=com (0)
Here is my conf file:
serverID 1
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/rfc2307bis.schema
include /etc/openldap/schema/yast.schema
include /etc/openldap/schema/samba3.schema
include /etc/openldap/schema/dnszone.schema
include /etc/openldap/schema/ygmt.schema
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
# Load dynamic backend modules:
modulepath /usr/lib/openldap/modules
access to attrs=SambaLMPassword,SambaNTPassword
    by dn="uid=administrator,ou=users,dc=ygmt,dc=com" write
    by * none
access to dn.base=""
    by * read
access to dn.base="cn=Subschema"
    by * read
access to attrs=userPassword,userPKCS12
    by self write
    by * auth
access to attrs=shadowLastChange
    by self write
    by * read
access to *
    by * read
loglevel 16384
TLSCertificateFile /etc/ssl/servercerts/servercert.pem
TLSCACertificatePath /etc/ssl/certs/
TLSCertificateKeyFile /etc/ssl/servercerts/serverkey.pem
database bdb
suffix "dc=ygmt,dc=com"
rootdn "uid=administrator,ou=users,dc=ygmt,dc=com"
rootpw "12345678pass"
directory /var/lib/ldap
checkpoint 1024 5
cachesize 10000
index objectClass,uidNumber,gidNumber eq
index member,mail eq,pres
index cn,displayname,uid,sn,givenname sub,eq,pres
index sambaSID eq
index sambaPrimaryGroupSID eq
index sambaDomainName eq
index entryCSN,entryUUID eq
index memberUid eq
index uniqueMember eq,pres
index sambaSIDList eq
index sambaGroupType eq
overlay memberof
syncrepl rid=001
    provider=ldap://172.16.2.1
    searchbase="dc=ygmt,dc=com"
    bindmethod=simple
    binddn="uid=administrator,ou=users,dc=ygmt,dc=com"
    credentials=12345678pass
    type=refreshOnly
    interval=00:00:05:00
    retry="20 5 300 +"
    schemachecking=off
    sizelimit=unlimited
    timelimit=unlimited
mirrormode on
overlay syncprov
The slapd.conf files on both of my servers are identical except for the serverID and the provider=ldap IP address.
The LDAP syncrepl was working previously.
Since the upgrade, I can't add any users if syncrepl is turned on.
Now if I want to add a record, I have to manually turn off syncrepl, add the user to ServerA, stop slapd on ServerB and remove its LDAP DB, turn syncrepl back on, then restart and re-sync.
How can I fix this issue?
Thanks~
Ken
13 years, 6 months
Syncrepl: 3 simple questions
by Torsten Schlabach (Tascel eG)
Hi!
What I would like to understand:
1. How do I query a master for the cookie?
2. How do I query a slave for the cookie?
3. How do I query the master in a human readable format for all changes
based on a cookie which I present?
My apologies if these questions should have been answered by reading the
documentation; I did not find it there, unfortunately.
I am sure someone who is a bit more into this can just reply with three
simple ldapsearch statements.
Regards,
Torsten
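A hedged sketch of what three such statements might look like, assuming the sync state is readable as the `contextCSN` operational attribute of the suffix entry and that ldapsearch supports the `-E sync=` search extension. The host names, the suffix, the function names and the cookie layout `rid=<id>,csn=<contextCSN value>` are all assumptions to adapt:

```shell
# Assumed placeholders; replace with real hosts and suffix.
PROVIDER_URI="ldap://provider.example.org:389"
CONSUMER_URI="ldap://consumer.example.org:389"
SUFFIX="dc=example,dc=org"

# Questions 1 and 2: read contextCSN from the suffix entry of the server
# whose URI is passed as the first argument.
context_csn() {
    ldapsearch -x -LLL -H "$1" -s base -b "$SUFFIX" contextCSN
}

# Question 3: refreshOnly sync search that presents a cookie to the
# provider; whatever the provider returns is printed as LDIF.
changes_since() {
    ldapsearch -x -LLL -H "$PROVIDER_URI" -b "$SUFFIX" \
        -E "sync=ro/$1" '(objectClass=*)' '*' '+'
}
```

Possible usage: `context_csn "$PROVIDER_URI"`, `context_csn "$CONSUMER_URI"`, then `changes_since 'rid=001,csn=<value read from the consumer>'`. The searches here bind anonymously (`-x` without `-D`), so the ACLs must allow reading the attribute, or bind options need to be added.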
13 years, 6 months
/var/lib/ldap/log.000*
by Ed Greenberg
I have a setup with a master and four slaves, all of which have a number
of files in the format /var/lib/ldap/log.000*
The replication is syncrepl.
Are all these files needed? How are they pruned/maintained? I am running
out of disk space :)
What can I read about this? I couldn't find anything.
Thanks,
</edg>
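Files named `log.0000000001` and so on are Berkeley DB transaction logs. Assuming the databases use the bdb or hdb backend and the Berkeley DB utilities are installed, a sketch of how unused ones might be listed and removed (on some distributions `db_archive` is installed under a versioned name; the function names are hypothetical):

```shell
# Assumed database directory, taken from the path in the question.
DB_DIR="/var/lib/ldap"

# List transaction log files that Berkeley DB no longer needs.
list_unused_logs() {
    db_archive -h "$DB_DIR"
}

# Remove those files. Old logs are needed for catastrophic recovery from
# a backup, so only remove them if that is acceptable.
remove_unused_logs() {
    db_archive -d -h "$DB_DIR"
}
```

Log files only become unused after a checkpoint, so a `checkpoint` directive in the database section matters here. Berkeley DB also has a `DB_LOG_AUTOREMOVE` flag that can be set via `set_flags` in the `DB_CONFIG` file to prune logs automatically.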
13 years, 6 months