Full_Name: Ryan Steele
Version: 2.4.23
OS: Ubuntu Server
URL: ftp://ftp.openldap.org/incoming/ryan-steele-110215.proxycache-failure.log
Submission from: (NULL) (207.106.239.81)
I use back-ldap + proxycache on many of my servers to reduce network traffic, to alleviate load on the masters, and to maintain service continuity in the event of a network failure. However, we have recently noticed an issue where the proxycache database claims that it has the data and that the query is answerable, but then fails to read the data from any of the indices where it expects to find it. It happens at random, and to random entries. We do not cache negative search results, so the cache should never return nentries=0 authoritatively. I can temporarily fix it for some broken users by restarting slapd and thereby clearing the cache (i.e., pcachePersist is set to FALSE), but inevitably others stop working (or sometimes the same users stop working again). When this happens, most entries are still served from the cache just fine, but the affected entries are never returned until slapd is restarted. I have tested this with 2.4.17, 2.4.21, and 2.4.23 on the amd64 architecture, with both libdb4.6 and libdb4.7.
Included below are the slapd.conf I use on my back-ldap + proxycache nodes, an example of the behavior using ldapsearch, and a pointer to the log messages from a failed search captured at log level 16383:
## Proxycache slapd configuration
# Schema
include /etc/ldap/schema/core.schema
include /etc/ldap/schema/collective.schema
include /etc/ldap/schema/corba.schema
include /etc/ldap/schema/cosine.schema
include /etc/ldap/schema/duaconf.schema
include /etc/ldap/schema/dyngroup.schema
include /etc/ldap/schema/inetorgperson.schema
include /etc/ldap/schema/java.schema
include /etc/ldap/schema/misc.schema
include /etc/ldap/schema/openldap.schema
include /etc/ldap/schema/ppolicy.schema
include /etc/ldap/schema/examplecom.schema
include /etc/ldap/schema/rfc2307bis.schema
include /etc/ldap/schema/samba.schema
include /etc/ldap/schema/apple_auxiliary.schema
include /etc/ldap/schema/apple.schema

# System
pidfile /var/run/slapd/slapd.pid
argsfile /var/run/slapd/slapd.args
loglevel stats
TLSCACertificateFile /etc/ldap/ssl/certs/cacert.pem
TLSCertificateFile /etc/ldap/ssl/certs/openldap.cert.pem
TLSCertificateKeyFile /etc/ldap/ssl/keys/openldap.key.pem
TLSVerifyClient never

# Modules
modulepath /usr/lib/ldap
moduleload back_ldap.la
moduleload back_hdb.la
moduleload pcache.la

# Back-LDAP
database ldap
uri "ldap://ldapmaster.example.com"
suffix "dc=example,dc=com"
rootdn "cn=admin,dc=example,dc=com"
rootpw SECRET
tls start

# ACLs
access to attrs=userPassword
    by tls_ssf=128 ssf=128 self write
    by tls_ssf=128 ssf=128 anonymous auth
    by tls_ssf=128 ssf=128 group/groupOfURLs/Member="cn=ops,ou=Groups,dc=example,dc=com" write
    by tls_ssf=128 ssf=128 * compare
access to *
    by tls_ssf=128 ssf=128 self write
    by tls_ssf=128 ssf=128 group/groupOfURLs/Member="cn=ops,ou=Groups,dc=example,dc=com" write
    by tls_ssf=128 ssf=128 * read

# ProxyCache
overlay pcache
proxycache hdb 500000 1 5000 86400
directory /var/lib/ldap/proxycache

index cn eq
index departmentName eq
index entryCSN eq
index entryUUID eq
index gidNumber eq
index mail eq
index member eq
index memberUid eq
index objectClass eq
index pcacheQueryid eq
index uid eq
index uidNumber eq
index uniqueMember eq

proxycachequeries 1000000
proxyattrset 0 apple-user-homeDirectory blogCategory cn dateCreated departmentName departmentNumber description displayColor employeeNumber gecos getsPages gidNumber givenName homeDirectory htaccessPasswd isAvailable isPhoneOperator lastAdminVisit loginShell mail manager member memberUid mobile mobileEmail numTickets objectClass ou phoneExtension sn sortOrder uid uidNumber uniqueMember userPassword

proxytemplate (blogCategory=) 0 86400
proxytemplate (cn=) 0 86400
proxytemplate (dateCreated=) 0 86400
proxytemplate (departmentName=) 0 86400
proxytemplate (departmentNumber=) 0 86400
proxytemplate (description=) 0 86400
proxytemplate (displayColor=) 0 86400
proxytemplate (employeeNumber=) 0 86400
proxytemplate (gecos=) 0 86400
proxytemplate (getsPages=) 0 86400
proxytemplate (gidNumber=) 0 86400
proxytemplate (givenName=) 0 86400
proxytemplate (homeDirectory=) 0 86400
proxytemplate (apple-user-homeDirectory=) 0 86400
proxytemplate (htaccessPasswd=) 0 86400
proxytemplate (isAvailable=) 0 86400
proxytemplate (isPhoneOperator=) 0 86400
proxytemplate (lastAdminVisit=) 0 86400
proxytemplate (loginShell=) 0 86400
proxytemplate (mail=) 0 86400
proxytemplate (manager=) 0 86400
proxytemplate (member=) 0 86400
proxytemplate (memberUid=) 0 86400
proxytemplate (memberURL=) 0 86400
proxytemplate (mobile=) 0 86400
proxytemplate (mobileEmail=) 0 86400
proxytemplate (numTickets=) 0 86400
proxytemplate (objectClass=) 0 86400
proxytemplate (ou=) 0 86400
proxytemplate (phoneExtension=) 0 86400
proxytemplate (sn=) 0 86400
proxytemplate (sortOrder=) 0 86400
proxytemplate (uid=) 0 86400
proxytemplate (uidNumber=) 0 86400
proxytemplate (uniqueMember=) 0 86400
proxytemplate (|(memberUid=)(member=)) 0 86400
proxytemplate (|(memberUid=)(uniqueMember=)) 0 86400
proxytemplate (&(objectClass=)(uid=)) 0 86400
proxytemplate (&(objectClass=)(memberUid=)) 0 86400
proxytemplate (&(objectClass=)(uniqueMember=)) 0 86400
proxytemplate (&(objectClass=)(uidNumber=)) 0 86400
proxytemplate (&(objectClass=)(gidNumber=)) 0 86400
proxytemplate (&(objectClass=)(|(memberUid=)(member=))) 0 86400
proxytemplate (&(objectClass=)(|(memberUid=)(uniqueMember=))) 0 86400
proxytemplate (&(objectClass=)(member=)) 0 86400
proxytemplate (&(objectClass=)(cn=)) 0 86400
proxytemplate (&(|(objectClass=)(objectClass=))(uid=)) 0 86400
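To double-check that a given search is covered by one of the proxytemplate lines, the filter can be reduced to its template form by dropping the assertion values. The sketch below is only an approximation: the function name filter_to_template is hypothetical (it is not part of OpenLDAP), the sed expression does not handle escaped parentheses and would mangle presence filters such as (uid=*), and the slapd.conf path in the usage comment is an assumption.

```shell
#!/bin/sh
# Hypothetical helper, not part of OpenLDAP: reduce an LDAP search filter to
# the value-less form used on proxytemplate lines.
# Approximation only: escaped parentheses and presence filters are not handled.
filter_to_template() {
    # Replace everything between an "=" and the next ")" with nothing.
    printf '%s\n' "$1" | sed 's/=[^()]*)/=)/g'
}

# Usage (the slapd.conf path is an assumption; adjust to the local install):
#   tmpl=$(filter_to_template '(&(|(objectClass=examplecomEmployee)(objectClass=examplecomUtilityUser))(uid=jdoe5))')
#   grep -F -- "proxytemplate $tmpl " /etc/ldap/slapd.conf
```

For the filter used in the ldapsearch example, this yields (&(|(objectClass=)(objectClass=))(uid=)), which is the last template in the configuration, so the failing queries are ones the cache is configured to answer.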
## Example of the failures using ldapsearch
bash:~# for i in `seq 1 14`; do echo "PROCESSING jdoe$i"; ldapsearch -x -H ldaps://localhost -LLL -b ou=Users,dc=example,dc=com '(&(|(objectClass=examplecomEmployee)(objectClass=examplecomUtilityUser))(uid=jdoe'$i'))' uid; sleep 1; done
PROCESSING jdoe1
dn: uid=jdoe1,ou=Users,dc=example,dc=com
uid: jdoe1

PROCESSING jdoe2
dn: uid=jdoe2,ou=Users,dc=example,dc=com
uid: jdoe2

PROCESSING jdoe3
dn: uid=jdoe3,ou=Users,dc=example,dc=com
uid: jdoe3

PROCESSING jdoe4
dn: uid=jdoe4,ou=Users,dc=example,dc=com
uid: jdoe4

PROCESSING jdoe5
PROCESSING jdoe6
PROCESSING jdoe7
PROCESSING jdoe8
PROCESSING jdoe9
PROCESSING jdoe10
PROCESSING jdoe11
PROCESSING jdoe12
PROCESSING jdoe13
PROCESSING jdoe14
bash:~#
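The manual loop above can be turned into a small check that prints only the users for which the server returns no entries, which makes it easier to compare the proxy against the master. This is a minimal sketch under assumptions: the function names count_entries and find_missing are hypothetical, the filter is copied from the command above, and only the ldapsearch options already shown there are used. A failed connection also produces zero entries, so a MISSING line does not by itself prove a cache problem.

```shell
#!/bin/sh
# Hypothetical helpers for spotting entries the cache fails to return.

# Count the entries in LDIF read from stdin (one "dn:" line per entry).
count_entries() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c '^dn:' || true
}

# Print "MISSING <uid>" for each uid that yields zero entries.
# $1 = LDAP URI, $2 = search base, remaining arguments = uids to test.
find_missing() {
    uri=$1; base=$2; shift 2
    for u in "$@"; do
        n=$(ldapsearch -x -H "$uri" -LLL -b "$base" \
            "(&(|(objectClass=examplecomEmployee)(objectClass=examplecomUtilityUser))(uid=$u))" uid \
            | count_entries)
        [ "$n" -eq 0 ] && echo "MISSING $u"
    done
    return 0
}

# Usage:
#   find_missing ldaps://localhost ou=Users,dc=example,dc=com jdoe1 jdoe5
```

Running the same call against the master's URI and comparing the two outputs would show which entries exist upstream but are not returned by the proxy.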
A log file (with log level set to 16383) showing what happens when the cache is queried, responds with "QUERY ANSWERABLE", and then fails to read data from any of the indices referenced can be found at ftp://ftp.openldap.org/incoming/ryan-steele-110215.proxycache-failure.log. It seems similar to ITS#6242, but my copy of pcache.c, in both the 2.4.21 and 2.4.23 versions of OpenLDAP, definitely contains that patch, as I can see it in the source (the manageDSAit control). Please let me know if you need any other information to debug this problem (e.g., specific variables from a debugger run, a copy of a proxycache database experiencing the problem, etc.).