Hi,
I have a particular object in my LDAP database that fails to replicate (using syncrepl between two slapd instances running 2.4.31-1+nmu2 on Debian Wheezy), although other objects replicate successfully. My olcSyncrepl configuration has no 'filter' setting that might exclude this particular object, and I've checked that the binddn I'm using has permission to see both this object and all of its attributes.
The (sanitised) configuration on the consumer is:
dn: olcDatabase={1}hdb,cn=config
olcSyncrepl: {0}rid=104
  provider=ldap://producer.example.com
  bindmethod=simple
  binddn="uid=replicator,ou=pseudoaccounts,dc=example,dc=com"
  credentials="..."
  searchbase="dc=example,dc=com"
  logbase="cn=accesslog"
  logfilter="(& (objectClass=auditWriteObject)(reqResult=0))"
  schemachecking=off
  type=refreshAndPersist
  retry="60 +"
  syncdata=accesslog
  starttls=critical
  tls_reqcert=demand
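One way to double-check that nothing in the consumer configuration narrows the replicated set is to split the olcSyncrepl value into its parameters and look for `filter`, `scope` and `attrs`. The following is only a minimal sketch: the helper name `parse_syncrepl` is made up for illustration, and the value is the sanitised one quoted above, not a live configuration.

```python
import re
import shlex


def parse_syncrepl(value):
    """Split an olcSyncrepl value into a dict of its key=value parameters.

    Hypothetical helper for illustration; it is not part of OpenLDAP.
    """
    # Drop the ordering prefix such as "{0}" that cn=config adds.
    value = re.sub(r"^\{\d+\}", "", value.strip())
    params = {}
    # shlex honours the double quotes, so retry="60 +" stays one token.
    for token in shlex.split(value):
        key, _, val = token.partition("=")
        params[key.lower()] = val
    return params


# The sanitised value from the message above.
SYNCREPL = (
    '{0}rid=104 provider=ldap://producer.example.com bindmethod=simple '
    'binddn="uid=replicator,ou=pseudoaccounts,dc=example,dc=com" '
    'credentials="..." searchbase="dc=example,dc=com" '
    'logbase="cn=accesslog" '
    'logfilter="(& (objectClass=auditWriteObject)(reqResult=0))" '
    'schemachecking=off type=refreshAndPersist retry="60 +" '
    'syncdata=accesslog starttls=critical tls_reqcert=demand'
)

params = parse_syncrepl(SYNCREPL)
for key in ("searchbase", "filter", "scope", "attrs"):
    print(f"{key}: {params.get(key, '(not set)')}")
```

For the configuration shown, only `searchbase` is set, which is consistent with the statement that no filter excludes the object.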
On the producer the overlay configuration for the database being replicated is:
dn: olcOverlay={1}syncprov,olcDatabase={1}hdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {1}syncprov
olcSpCheckpoint: 100 600
olcSpSessionlog: 100
olcSpNoPresent: TRUE
If I follow the sanitising I did in the above, then the failing object would be uid=replicationcheck,ou=pseudoaccounts,dc=example,dc=com, and a successfully replicated object would be uid=geoffc,ou=People,dc=example,dc=com.
I've stopped slapd on the consumer and deleted all the /var/lib/ldap/ database files, to force re-replication. I get the same symptoms: this one object doesn't replicate, but lots of other objects do.
Any tips on how to further debug this?
Many thanks,
--On Friday, February 13, 2015 9:42 AM +1100 Geoff Crompton geoffc@trinity.unimelb.edu.au wrote:
Hi,
I have a particular object in my LDAP database that is failing to replicate (using syncrepl between two slapd's running 2.4.31-1+nmu2
Fail. This is a known bad release for syncrepl.
--Quanah
--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
On Fri, 13 Feb 2015 09:42:22 +1100, Geoff Crompton geoffc@trinity.unimelb.edu.au wrote:
Any tips on how to further debug this?
access rules? corrupted file system? corrupted database?
Dieter
On 13/02/15 18:46, Dieter Klünter wrote:
[...]
Any tips on how to further debug this?
access rules? corrupted file system? corrupted database?
Dieter
I've checked access rules on the producer, and the account the consumer is replicating with has access to the object in question. I've confirmed the object doesn't exist on the consumer by reviewing the "slapcat" output, so it's not just access rules preventing me from seeing the object.
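The slapcat comparison can be automated by diffing the entry DNs in an LDIF dump from each side. This is a minimal sketch under some assumptions: the function names are invented for illustration, DNs are compared after simple lower-casing rather than full DN normalisation, and the sample LDIF below is made up from the sanitised DNs in this thread.

```python
import base64


def ldif_dns(ldif_text):
    """Return the set of entry DNs found in LDIF text, lower-cased."""
    # Unfold continuation lines: a line starting with one space
    # continues the previous line.
    unfolded = ldif_text.replace("\r\n", "\n").replace("\n ", "")
    dns = set()
    for line in unfolded.split("\n"):
        if line.startswith("dn::"):
            # A double colon marks a base64-encoded value.
            raw = base64.b64decode(line[4:].strip())
            dns.add(raw.decode("utf-8").lower())
        elif line.startswith("dn:"):
            dns.add(line[3:].strip().lower())
    return dns


def missing_on_consumer(provider_ldif, consumer_ldif):
    """List DNs present in the provider dump but absent from the consumer."""
    return sorted(ldif_dns(provider_ldif) - ldif_dns(consumer_ldif))


# Made-up sample dumps using the sanitised DNs from this thread.
provider_dump = (
    "dn: uid=geoffc,ou=People,dc=example,dc=com\n"
    "uid: geoffc\n"
    "\n"
    "dn: uid=replicationcheck,ou=pseudoaccounts,dc=example,dc=com\n"
    "uid: replicationcheck\n"
)
consumer_dump = (
    "dn: uid=geoffc,ou=People,dc=example,dc=com\n"
    "uid: geoffc\n"
)

for dn in missing_on_consumer(provider_dump, consumer_dump):
    print(dn)
```

In practice the two strings would be read from files holding the slapcat output of the producer and of the consumer.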
I hope I've avoided a corrupted file system or corrupted database, as I'm seeing the same problem across multiple consumer VMs. At the same time, some older VMs (running Debian Squeeze and slapd 2.4.23-7.3) are not having this problem, which I think rules out a corrupted filesystem or database on the producer.
This information supports Quanah's observation that the problem is probably the syncrepl bugs in 2.4.31.
I'm going to upgrade my consumer to the 2.4.40 backport that is now available in Debian Wheezy, and see if the problem goes away (and probably upgrade my producer likewise in my next maintenance window).
Geoff Crompton geoffc@trinity.unimelb.edu.au writes:
Any tips on how to further debug this?
Check syslog for error messages (if your slapd logs there). Try using the -d [...] option of slapd; it may give some insight. Try upgrading to version 2.4.31+really2.4.40-3~bpo70+1 in wheezy-backports.
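When going through syslog, it can help to narrow the output to lines that mention the replication ID of the consumer in question (rid=104 in the configuration quoted in this thread). The following is a rough sketch only: the function name is invented, the log path in the comment is an assumption about a typical Debian setup, and whether a given slapd message carries a `rid=` tag depends on the message and the log level.

```python
import re


def lines_for_rid(lines, rid):
    """Return the log lines that mention the given syncrepl rid.

    Hypothetical helper; matches "rid=104" as well as zero-padded
    forms such as "rid=004" for rid 4.
    """
    pattern = re.compile(r"\brid=0*%d\b" % rid)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Example use (path is an assumption; adjust to where slapd logs):
#
#     with open("/var/log/syslog", errors="replace") as log:
#         for line in lines_for_rid(log, 104):
#             print(line)
```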
openldap-technical@openldap.org