https://bugs.openldap.org/show_bug.cgi?id=10080
Issue ID: 10080
Summary: refreshAndPersist synchronization problem with glue + rwm
Product: OpenLDAP
Version: 2.6.2
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs@openldap.org
Reporter: homma@allworks.co.jp
Target Milestone: ---
Created attachment 972
  --> https://bugs.openldap.org/attachment.cgi?id=972&action=edit
Stack trace of segfault
I have an OpenLDAP 2.6.2 server "ldap1" with the following DIT:

dc=example,dc=com (back-mdb)
  ou=users
    ou=local
      cn=admin
      cn=sync
      ...
    ou=remote (back-ldap -> ldaps://dc1.example.com)
      ...
Local user entries are created under subtree "ou=local,ou=users,dc=example,dc=com", and the subtree "ou=remote,ou=users,dc=example,dc=com" is a proxy to an Active Directory server "dc1.example.com" (subtree "ou=users,dc=ad,dc=example,dc=com").
The concrete configuration is as follows:
----------------
dn: olcDatabase={2}ldap,cn=config
objectClass: olcDatabaseConfig
objectClass: olcLDAPConfig
olcDatabase: {2}ldap
olcSuffix: ou=remote,ou=users,dc=example,dc=com
olcSubordinate: TRUE
olcRootDN: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
olcDbURI: ldaps://dc1.example.com
olcDbIDAssertBind: bindmethod=simple
  binddn="cn=aduser,ou=users,dc=ad,dc=example,dc=com"
  credentials=secret
  tls_reqcert=demand
  mode=none
olcDbIDAssertAuthzFrom: {0}dn.exact:gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
olcDbIDAssertAuthzFrom: {1}dn.exact:cn=admin,ou=local,ou=users,dc=example,dc=com

dn: olcOverlay={0}rwm,olcDatabase={2}ldap,cn=config
objectClass: olcOverlayConfig
objectClass: olcRwmConfig
olcOverlay: {0}rwm
olcRwmRewrite: {0}rwm-suffixmassage "ou=users,dc=ad,dc=example,dc=com"
olcRwmMap: {0}objectclass inetOrgPerson organizationalPerson
olcRwmMap: {1}objectclass posixAccount user
olcRwmMap: {2}attribute uid sAMAccountName
olcRwmMap: {3}attribute homeDirectory unixHomeDirectory
olcRwmMap: {4}attribute ou *
olcRwmMap: {5}attribute cn *
olcRwmMap: {6}attribute sn *
olcRwmMap: {7}attribute givenName *
olcRwmMap: {8}attribute mail *
olcRwmMap: {9}attribute uidNumber *
olcRwmMap: {10}attribute gidNumber *
olcRwmMap: {11}attribute *

dn: olcDatabase={3}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {3}mdb
olcDbDirectory: /var/lib/ldap
olcSuffix: dc=example,dc=com
olcRootDN: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
olcAccess: {0}to *
  by dn.exact="cn=admin,ou=local,ou=users,dc=example,dc=com" write
  by dn.exact="cn=sync,ou=local,ou=users,dc=example,dc=com" write
  by * break
olcAccess: {1}to attrs=userPassword
  by anonymous auth
  by self write
  by * none
olcAccess: {2}to *
  by * read
olcDbIndex: objectClass eq,pres
olcDbIndex: ou,cn,mail,surname,givenname eq,pres,sub
----------------
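The DN rewriting done by the rwm-suffixmassage rule above can be illustrated with a small model. This is only a sketch of the intended mapping, not slapo-rwm code: the function name massage_suffix and the entry cn=alice are hypothetical, and the string handling ignores escaped commas and full DN normalization.

```python
# Simplified model of suffix massaging between the local (virtual) naming
# context and the remote (real) one. Not OpenLDAP code.
LOCAL_SUFFIX = "ou=remote,ou=users,dc=example,dc=com"
REMOTE_SUFFIX = "ou=users,dc=ad,dc=example,dc=com"


def massage_suffix(dn, from_suffix, to_suffix):
    """Replace from_suffix at the end of dn with to_suffix.

    The comparison is case-insensitive. A DN that is not under
    from_suffix is returned unchanged.
    """
    dn_l = dn.lower()
    suf_l = from_suffix.lower()
    if dn_l == suf_l:
        return to_suffix
    if dn_l.endswith("," + suf_l):
        # Keep the leading RDNs (including the trailing comma).
        return dn[: len(dn) - len(from_suffix)] + to_suffix
    return dn


# Hypothetical entry under the proxied subtree.
local_dn = "cn=alice,ou=remote,ou=users,dc=example,dc=com"
remote_dn = massage_suffix(local_dn, LOCAL_SUFFIX, REMOTE_SUFFIX)
print(remote_dn)
```

Note that the rewritten DN still ends in dc=example,dc=com, so it also falls under the suffix of the mdb database.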
So far, so good. A subtree search on "ou=users,dc=example,dc=com" returns both local and remote users.
But when I create a second server "ldap2" with a similar configuration and configure refreshAndPersist replication, I run into a problem.
(1) When I configure the following on the "ldap1" server,
----------------
dn: olcOverlay={0}syncprov,olcDatabase={3}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
----------------
and on "ldap2" server,
----------------
dn: olcDatabase={3}mdb,cn=config
changeType: modify
replace: olcSyncrepl
olcSyncrepl: {0}rid=301
  provider="ldap://ldap1/"
  bindmethod=simple
  binddn="cn=sync,ou=local,ou=users,dc=example,dc=com"
  credentials=secret
  searchbase="dc=example,dc=com"
  type=refreshAndPersist
  retry="5 12 60 +"
  timeout=1
----------------
the initial refresh stage fails.
(a) With the above configuration, the refresh fails with "(48) Inappropriate authentication", because the bind DN "cn=sync,ou=local,ou=users,dc=example,dc=com" does not have access to the subordinate database.
(b) When I add "cn=sync,ou=local,ou=users,dc=example,dc=com" to the ID assertion list on "ldap1" server,
----------------
dn: olcDatabase={2}ldap,cn=config
changeType: modify
add: olcDbIDAssertAuthzFrom
olcDbIDAssertAuthzFrom: {2}dn.exact:cn=sync,ou=local,ou=users,dc=example,dc=com
----------------
the refresh fails with "(12) Critical extension is unavailable", because Active Directory does not support Sync Request Control.
(c) Even if the remote server supports Sync Request Control, the refresh fails with the message "server sent multiple refreshDone messages? Ending session". The refreshDone message is sent twice, once for the superior database and once for the subordinate database.
(d) If I delete olcSubordinate attribute and restart slapd on "ldap1" server,
----------------
dn: olcDatabase={2}ldap,cn=config
changeType: modify
delete: olcSubordinate
----------------
then the refresh stage completes successfully. Once the persistent session is established, I can add olcSubordinate attribute again.
----------------
dn: olcDatabase={2}ldap,cn=config
changeType: modify
add: olcSubordinate
olcSubordinate: TRUE
----------------
When I modify entries in the subordinate database on "ldap1" server, no change notification is sent to "ldap2" server. This is the desired behavior, but if I restart slapd on "ldap1" server, the refresh starts failing again.
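The failure in case (c) can be modeled roughly as follows. This is a minimal sketch of the consumer-side check as inferred from the log message, not the actual syncrepl code: the function name consume, the message tuples, and the entry names are hypothetical.

```python
# Simplified model of a syncrepl consumer that tolerates only one
# refreshDone per session. Not OpenLDAP code.
MULTIPLE_DONE = "server sent multiple refreshDone messages? Ending session"


def consume(messages):
    """Process (kind, payload) messages and return (entries, status).

    The session ends as soon as a second refreshDone is seen.
    """
    refresh_done = False
    entries = []
    for kind, payload in messages:
        if kind == "entry":
            entries.append(payload)
        elif kind == "refreshDone":
            if refresh_done:
                return entries, MULTIPLE_DONE
            refresh_done = True
    return entries, "ok"


# One refreshDone for the superior database, then a second one after the
# subordinate database has been searched (hypothetical message stream).
stream = [
    ("entry", "cn=admin,ou=local,ou=users,dc=example,dc=com"),
    ("refreshDone", None),
    ("entry", "cn=alice,ou=remote,ou=users,dc=example,dc=com"),
    ("refreshDone", None),
]
entries, status = consume(stream)
print(status)
```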
(2) When I configure the glue overlay explicitly before the syncprov overlay, as described in "man slapd-config",
----------------
dn: olcOverlay={0}glue,olcDatabase={3}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcConfig
olcOverlay: {0}glue

dn: olcOverlay={1}syncprov,olcDatabase={3}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {1}syncprov
----------------
the refresh stage completes successfully without attempting to search the subordinate database. This is fine because I do not need to synchronize the subordinate database between "ldap1" and "ldap2" servers.
However, when I modify an entry in the subordinate database on "ldap1" server, slapd crashes with a segmentation fault. See the attached file for the stack trace.
After some research, I found that the cause of the crash is as follows: syncprov_matchops() attempts to get the modified entry with DN = op->o_req_ndn. But since op->o_req_ndn has already been rewritten by the rwm overlay, glue_back_select() incorrectly selects the mdb backend where it should select the ldap backend. At this point, op->o_bd->be_private holds a value of type ldapinfo_t, but mdb_entry_get() tries to interpret it as type struct mdb_info, causing a segfault.
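The wrong backend selection can be reproduced with a simplified model. This is a sketch only, assuming the backend is chosen by the most specific suffix that matches the normalized DN; it is not the glue_back_select() implementation, and select_backend, is_suffix, and the entry cn=alice are hypothetical names.

```python
# Simplified model of choosing a backend by DN suffix. Not OpenLDAP code.
BACKENDS = {
    "dc=example,dc=com": "mdb",
    "ou=remote,ou=users,dc=example,dc=com": "ldap",
}


def is_suffix(ndn, suffix):
    """True if the normalized DN equals suffix or lies below it."""
    return ndn == suffix or ndn.endswith("," + suffix)


def select_backend(ndn, backends=BACKENDS):
    """Return the backend with the longest suffix matching ndn, or None."""
    matches = [s for s in backends if is_suffix(ndn, s)]
    if not matches:
        return None
    return backends[max(matches, key=len)]


# DN as the client sent it: handled by the ldap backend.
original = "cn=alice,ou=remote,ou=users,dc=example,dc=com"
# Same DN after suffix massaging: it no longer matches the ldap suffix,
# but it still ends in dc=example,dc=com, so the mdb backend is chosen.
rewritten = "cn=alice,ou=users,dc=ad,dc=example,dc=com"

print(select_backend(original), select_backend(rewritten))
```

In this model the rewritten DN resolves to mdb only because the Active Directory suffix happens to sit under dc=example,dc=com.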
In summary, the problem is:
When I configure refreshAndPersist synchronization for a database with a subordinate ldap backend using DN rewriting,
(1) The subordinate database cannot be excluded from both the refresh and persist stages of the synchronization:

When the glue overlay is not explicitly configured:
- In the refresh stage, the subordinate database is included in the search.
- In the persist stage, the subordinate database is excluded from the synchronization.

When the glue overlay is explicitly configured before the syncprov overlay:
- In the refresh stage, the subordinate database is excluded from the search.
- In the persist stage, the subordinate database is included in the synchronization.
This seems to be inconsistent.
(2) If the subordinate database is included in the refresh stage, the refresh fails for one of the following reasons:
- the syncrepl user is not allowed to access the subordinate database
- the remote server does not support Sync Request Control
- multiple refreshDone messages are returned
The refresh stage completes successfully if the olcSubordinate attribute is deleted from the subordinate database. The attribute can be added again once the persistent session is established, but the refresh stage starts failing again if slapd is restarted.
(3) If the subordinate database is included in the persist stage, modifying entries in the subordinate database causes slapd to crash.