Replication appears to vanish, sometimes.

24 Mar 2008


Hello list,
SunOS ldapmaster01.unix 5.10 Generic_118844-26 i86pc i386 i86pc
openldap-2.3.19
db-4.2.52.NC-sol10-intel-local
./configure LDFLAGS=-L/usr/local/lib --enable-crypt
Generally, OpenLDAP is working well, and replication is working. I get
no .rej files, and the replog file itself is frequently drained.
However, ldapmaster and ldapslave frequently get out of sync. I
initially thought it was just the modrdn operations, but we have had
"simpler" operations vanish as well.
I realise this isn't much to go on, but perhaps I can get suggestions 
which may lead to more clues.
The important-looking sections of slapd.conf:
loglevel none
database        bdb
suffix          "dc=company,dc=jp"
rootdn          "cn=admin,dc=company,dc=jp"
access to attr=userPassword
         by self write
         by anonymous auth
         by * none
access to *
         by self write
         by dn="cn=admin,dc=company,dc=jp" write
         by * read
password-hash {CRYPT}
checkpoint 128 15
directory       /usr/local/var/openldap-data
index   objectClass     eq
index   uid                     eq
index   uidNumber               eq
index   mail                    eq
index   mailAlternateAddress    pres,eq
index   deliveryMode            eq
index   accountStatus           eq
index   gecos                   eq
index   radiusGroupName         eq
index   o                       pres,eq
replogfile /usr/local/var/openldap-slurp/replica/slurpd.replog
replica host=172.20.12.23:389
         binddn="cn=admin,dc=company,dc=jp"
         bindmethod=simple
         credentials="<secret>"
The ldapslave slapd.conf is identical (it really is scp'ed) but with the
last two directives removed (replogfile, replica).
# /usr/local/bin/ldapsearch -h ldapmaster01 -D cn=admin,dc=company,dc=jp
-x -w <secret> -b ou=mail,dc=company,dc=jp mail=test01@cs-sd01.com
# extended LDIF
#
# LDAPv3
# base <ou=mail,dc=company,dc=jp> with scope subtree
# filter: mail=test01@cs-sd01.com
# requesting: ALL
#
# test01, cs-sd01.com, mail, company.jp
dn: uid=test01,o=cs-sd01.com,ou=mail,dc=company,dc=jp
objectClass: top
objectClass: account
objectClass: posixAccount
objectClass: shadowAccount
objectClass: qmailUser
objectClass: amavisAccount
uid: test01
cn: test01
[snip]
# search result
search: 2
result: 0 Success
# numResponses: 2
# numEntries: 1
# /usr/local/bin/ldapsearch -h ldapslave01 -D cn=admin,dc=company,dc=jp 
-x -w <secret> -b ou=mail,dc=company,dc=jp mail=test01@cs-sd01.com
# extended LDIF
#
# LDAPv3
# base <ou=mail,dc=secret,dc=jp> with scope subtree
# filter: mail=test01@cs-sd01.com
# requesting: ALL
#
# search result
search: 2
result: 0 Success
# numResponses: 1
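Checking one entry at a time like this does not scale, so here is a rough sketch of how the two servers could be compared in bulk. It assumes both directories have already been dumped to LDIF text (for example with ldapsearch or slapcat); the function names ldif_dns and missing_on_slave are made up for this illustration. It only looks at plain "dn: " lines, so base64-encoded "dn:: " entries are not handled, and DNs are lowercased as a crude normalisation.

```python
def ldif_dns(ldif_text):
    """Return the set of DNs in an LDIF dump, unfolding wrapped lines."""
    dns = set()
    current = None  # DN being accumulated, or None when not inside a dn: line
    for line in ldif_text.splitlines():
        if line.startswith(" ") and current is not None:
            current += line[1:]  # LDIF continuation line: drop the leading space
            continue
        if current is not None:
            dns.add(current.lower())
            current = None
        if line.startswith("dn: "):
            current = line[4:]
    if current is not None:
        dns.add(current.lower())
    return dns


def missing_on_slave(master_ldif, slave_ldif):
    """Return the sorted DNs present in the master dump but not the slave dump."""
    return sorted(ldif_dns(master_ldif) - ldif_dns(slave_ldif))


# Tiny hand-written dumps standing in for real output from the two servers.
master = (
    "dn: uid=test01,o=cs-sd01.com,ou=mail,dc=company,dc=jp\n"
    "uid: test01\n"
    "\n"
    "dn: ou=mail,dc=company,dc=jp\n"
    "ou: mail\n"
)
slave = "dn: ou=mail,dc=company,dc=jp\nou: mail\n"

print(missing_on_slave(master, slave))
# prints ['uid=test01,o=cs-sd01.com,ou=mail,dc=company,dc=jp']
```

A list like this only shows entries that never arrived on the slave; entries that exist on both sides but differ in their attributes would need a per-attribute comparison.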
The various slapd logs appear to contain little of interest, but perhaps
"loglevel none" is the wrong setting?
ldapmaster usually gives lines like:
Mar 25 09:51:23 ldapmaster01.unix slapd[1319]: [ID 458966 local4.debug] do_search: invalid dn (uid=a.user, o=, ou=mail, dc=company, dc=jp)
Mar 25 09:51:25 ldapmaster01.unix slapd[1319]: [ID 458966 local4.debug] do_search: invalid dn (uid=user4447, o=, ou=mail, dc=company, dc=jp)
Mar 25 09:51:29 ldapmaster01.unix slapd[1319]: [ID 458966 local4.debug] do_search: invalid dn (uid=6201535, o=, ou=mail, dc=company, dc=jp)
Which is just users entering their usernames wrong.
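To make it clearer what is wrong in those lines: the search base ends up with an empty "o=" component when the user leaves the domain part off. A small sketch of how such lines could be picked out of the log automatically follows. The helper name empty_rdns is made up for this example, it expects each syslog message on a single line, and it splits the DN on commas naively, so escaped commas inside values would confuse it.

```python
import re

# Matches the slapd message shown above and captures the rejected DN.
INVALID_DN = re.compile(r"do_search: invalid dn \((.*)\)")


def empty_rdns(log_line):
    """Return the attribute names whose value is empty in a rejected DN."""
    match = INVALID_DN.search(log_line)
    if match is None:
        return []  # not an "invalid dn" line
    empty = []
    for rdn in match.group(1).split(","):
        attr, _, value = rdn.strip().partition("=")
        if value.strip() == "":
            empty.append(attr)
    return empty


line = (
    "Mar 25 09:51:23 ldapmaster01.unix slapd[1319]: [ID 458966 local4.debug] "
    "do_search: invalid dn (uid=a.user, o=, ou=mail, dc=company, dc=jp)"
)
print(empty_rdns(line))
# prints ['o']
```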
ldapslave01 has no real log entries:
Mar 24 22:42:41 ldapslave01.unix slapd[25495]: [ID 854361 local4.debug] bdb_db_open: unclean shutdown detected; attempting recovery.
Mar 24 22:42:41 ldapslave01.unix slapd[25495]: [ID 100111 local4.debug] slapd starting
When things are out of sync, I stop ldapslave01, stop provisioning (the 
only thing writing to ldap), then rsync everything from ldapmaster, and 
start it again.
We are considering running without ldapslave until this can be resolved;
it will just be a bit slower. As I said, not all replication fails,
just some of it.
There was a period when ldapslave was reporting "too many open files"
errors, but we have increased the file descriptor limit since then
(6 weeks ago).
I do also have a DB_CONFIG file.
Lund
-- 
Jorgen Lundman       | lundman@lundman.net
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)