Hi,
As an interim measure while deploying 2.4.16, I am canarying 2.3.43 on a replication provider. As a result, the current replication path is:
master (2.3.39) -> provider (2.3.43) -> replica (2.4.16)
The master will be upgraded in short order once the 2.3.43 canary is successful.
I've been seeing occasional corrupt DNs in some be_add log lines on the 2.4.16 replica:
May 5 09:35:46 host slapd[31817]: syncrepl_message_to_op: rid=100 be_add <90>Y1 ntry,ou=subtree,dc=example,dc=com (0)
I've modified the DN in this log line. In this example the leading "cn=l" is missing and the bytes "<90>Y1 " appear in its place. The original DN was 65 characters long.
I ran the following search against each host. The results show that the entry replicated fine, but the capitalisation of the DN differs (which may be a red herring, since I was already aware that DN capitalisation differed across servers):
$ ldapsearch -x -b cn=lntry,ou=subtree,dc=example,dc=com -s base dn
master: dn is cn=lntry,ou=Subtree,dc=example,dc=com
provider: dn is cn=lntry,ou=subtree,dc=example,dc=com
replica: dn is cn=lntry,ou=Subtree,dc=example,dc=com
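For illustration, a minimal sketch of how one might check that two of the DNs above differ only in letter case. The function name dn_differs_only_in_case is hypothetical, and this is a plain string comparison: it does not apply the per-attribute matching rules that slapd uses when normalising DNs.

```c
#include <assert.h>
#include <string.h>   /* strcmp */
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical helper: returns 1 if the two DN strings are different
 * byte-for-byte but equal when letter case is ignored, 0 otherwise.
 * Assumption: a raw string comparison is good enough for a quick check;
 * real DN matching depends on the schema's matching rules. */
int dn_differs_only_in_case(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) != 0 && strcasecmp(a, b) == 0;
}
```

Called with the master and provider DNs from the search results, this returns 1; called with two identical strings it returns 0.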
slapcat shows no problems with the entry on the 2.4.16 host.
Since the database looks fine, I wonder if this is just a logging issue. Should this Debug statement in syncrepl.c actually use op->ora_e->e_name.bv_val or some other field?
    rc = op->o_bd->be_add( op, &rs );
    Debug( LDAP_DEBUG_SYNC,
        "syncrepl_message_to_op: %s be_add %s (%d)\n",
        si->si_ridtxt, op->o_req_dn.bv_val, rc );
With the exception of si->si_rid becoming si->si_ridtxt (and %d becoming %s), this Debug statement has not changed since 2.3.
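To make the question concrete, here is a standalone sketch of a diagnostic log line that prints both candidate DNs, each bounded by its own bv_len, plus a flag saying whether they are byte-identical. It is not slapd code: berval_sketch, entry_sketch, bervals_identical and format_be_add_log are hypothetical stand-ins that only mirror the bv_len/bv_val and e_name fields mentioned above. Whether op->ora_e is still valid after be_add returns in the real code is an assumption that would need checking in syncrepl.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the OpenLDAP types; only the fields used here. */
struct berval_sketch { size_t bv_len; const char *bv_val; };
struct entry_sketch  { struct berval_sketch e_name; };

/* Returns 1 if both values have the same length and the same bytes. */
int bervals_identical(const struct berval_sketch *a,
                      const struct berval_sketch *b)
{
    assert(a != NULL && b != NULL);
    return a->bv_len == b->bv_len &&
           memcmp(a->bv_val, b->bv_val, a->bv_len) == 0;
}

/* Formats a be_add log line showing the request DN and the entry DN side
 * by side. "%.*s" prints at most bv_len bytes of each value, so the output
 * does not rely on a terminating NUL byte in bv_val.
 * Returns the snprintf result (length the full line needs). */
int format_be_add_log(char *buf, size_t buflen, const char *ridtxt,
                      const struct berval_sketch *req_dn,
                      const struct entry_sketch *e, int rc)
{
    assert(buf != NULL && ridtxt != NULL && req_dn != NULL && e != NULL);
    return snprintf(buf, buflen,
        "syncrepl_message_to_op: %s be_add req_dn=\"%.*s\" "
        "e_name=\"%.*s\" same=%d (%d)\n",
        ridtxt,
        (int)req_dn->bv_len, req_dn->bv_val,
        (int)e->e_name.bv_len, e->e_name.bv_val,
        bervals_identical(req_dn, &e->e_name), rc);
}
```

If the two DNs printed this way disagree on the replica, that would suggest the pointer being logged is the problem rather than the stored entry.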
--
Thanks,
Sean Burford