Occasional corrupt DN in be_add logs under 2.4.16 - openldap-technical

5 May 2009


Hi,
As an interim measure while deploying 2.4.16 I am canarying 2.3.43 on a
replication provider.  As a result the current replication path is:
 master (2.3.39) -> provider (2.3.43) -> replica (2.4.16)
The master will be upgraded in short order once the 2.3.43 canary is
successful.
I've been seeing occasional corrupt DNs in some be_add log lines on the
2.4.16 replica:
May  5 09:35:46 host slapd[31817]: syncrepl_message_to_op: rid=100 be_add
<90>Y1 ntry,ou=subtree,dc=example,dc=com (0)
I've modified the DN in this log line.  In this example the first four
characters, which should read "cn=l", have been replaced by garbage.  The
original DN was 65 characters long.
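To find other affected lines in syslog, a check along these lines could be used. This is only a rough sketch: the helper names extract_be_add_dn and dn_looks_sane are made up for illustration and are not part of OpenLDAP, and the test is a heuristic (the first RDN must start with an attribute type followed by '=', and no control bytes may appear), not a real DN parser.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the DN that follows " be_add " in a
 * syncrepl_message_to_op log line into out.
 * Returns 1 on success, 0 if the line does not match or out is too small. */
int extract_be_add_dn(const char *line, char *out, size_t outlen)
{
    static const char marker[] = " be_add ";
    const char *start = strstr(line, marker);
    const char *end = NULL;
    const char *p;
    size_t len;

    if (start == NULL)
        return 0;
    start += sizeof(marker) - 1;

    /* The Debug format ends the DN with " (rc)", so cut at the last " (". */
    for (p = start; (p = strstr(p, " (")) != NULL; p++)
        end = p;
    if (end == NULL)
        return 0;

    len = (size_t)(end - start);
    if (len == 0 || len >= outlen)
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}

/* Hypothetical heuristic: return 1 if dn starts with something that looks
 * like an attribute type followed by '=' and contains no control bytes. */
int dn_looks_sane(const char *dn)
{
    const unsigned char *p = (const unsigned char *)dn;

    /* Attribute type: a descriptor such as "cn" or a dotted numeric OID. */
    if (!isalnum(*p))
        return 0;
    while (isalnum(*p) || *p == '-' || *p == '.')
        p++;
    if (*p != '=')
        return 0;

    /* Reject control bytes anywhere in the remainder. */
    for (; *p != '\0'; p++)
        if (*p < 0x20 || *p == 0x7f)
            return 0;
    return 1;
}
```

Run against the log line above, extract_be_add_dn yields the garbled DN and dn_looks_sane rejects it, because it does not begin with an attribute type.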
I ran the following search against each host.  The results show that the
entry replicated fine, but that the capitalisation of the DN differs between
servers (which may be a red herring, since I was already aware that DN
capitalisation differed across servers):
$ ldapsearch -x -b cn=lntry,ou=subtree,dc=example,dc=com -s base dn
master: dn is cn=lntry,ou=Subtree,dc=example,dc=com
provider: dn is cn=lntry,ou=subtree,dc=example,dc=com
replica: dn is cn=lntry,ou=Subtree,dc=example,dc=com
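To make the comparison concrete, here is a small sketch that tests whether two DN strings differ only in ASCII case. The function name dn_differs_only_by_case is hypothetical, and this is a plain byte-wise comparison using POSIX strcasecmp; it does not apply the per-attribute matching rules slapd uses when it normalises DNs.

```c
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical helper: return 1 if a and b are not byte-identical but
 * compare equal when ASCII case is ignored, 0 otherwise. */
int dn_differs_only_by_case(const char *a, const char *b)
{
    if (strcmp(a, b) == 0)
        return 0;                    /* identical, no difference at all */
    return strcasecmp(a, b) == 0;    /* equal once case is folded */
}
```

With the results above, master versus provider differs only by case (ou=Subtree against ou=subtree), while master and replica are byte-identical.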
slapcat shows no problems with the entry on the 2.4.16 host.
Since the database looks fine, I wonder whether this is just a logging
issue.  Should this Debug statement in syncrepl.c actually use
op->ora_e->e_name.bv_val, or some other field, instead of
op->o_req_dn.bv_val?
rc = op->o_bd->be_add( op, &rs );
Debug( LDAP_DEBUG_SYNC,
       "syncrepl_message_to_op: %s be_add %s (%d)\n",
       si->si_ridtxt, op->o_req_dn.bv_val, rc );
Apart from si->si_rid becoming si->si_ridtxt (and its format specifier
changing from %d to %s), this Debug statement has not changed since 2.3.
-- 
Thanks,
Sean Burford