Sudden drop in cache efficiency - openldap-technical

9 Sep 2009


Hi list,
I'm currently monitoring the cache efficiency of a BerkeleyDB backend,
as reported by db_stat -m. About 24 hours after starting slapd I
noticed a significant drop for several indexes, while the general
activity did not change much at that time. Also, the systems using the
service did not experience any reduction in availability or
responsiveness. Expressed as a percentage, the cache efficiency
remains at 99%.
Its MirrorMode peer experienced a similar drop at the exact same time, 
though for fewer indexes.
The affected nodes are in a refreshAndPersist, retry="60 +" MirrorMode 
setup. The backend database is configured as hdb. The binaries in use are 
the latest RPMs from Buchan Milne (openldap2.4-servers-2.4.17-3.rhel5, 
which comes packed with its own BerkeleyDB 4.7 libraries), running on a 
RHEL5.3 system.
The graphs are available at http://www.ruberg.no/tmp/slapd.html. The 
drops occurred just before 15:00.
The nodes are in the same IP network without any routers or firewalls 
in-between. Replication between the nodes works flawlessly.
Is this kind of drop normal behaviour? Has slapd stopped asking its 
backend (or peer) for data and started serving everything from its own 
buffer? Since the observations correlate between the active and the 
passive node I currently suspect this involves the replication mechanism(s).
(The reason some indexes are not currently graphed is that the numbers 
read from db_stat are suffixed with SI units when above 10 million, and 
the monitoring tool doesn't account for that yet.)
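One possible way to handle the suffixed values is sketched below in
Python. This is only an illustration, not the monitoring tool's actual
code: the function name parse_count, the set of suffixes (K, M, G,
treated as decimal multipliers) and the sample lines are assumptions,
and a suffixed value has already been rounded by db_stat, so the
result is an approximation.

```python
import re

# Multipliers for the suffixes this sketch recognises.
# Assumption: the suffixes are decimal SI multipliers.
_SI = {"": 1, "K": 10**3, "M": 10**6, "G": 10**9}

# A leading integer, optionally followed by one suffix letter, then a
# word boundary so that size fields such as "10MB" are not matched.
_COUNT = re.compile(r"^\s*(\d+)([KMG]?)\b")


def parse_count(line):
    """Return the leading counter of a db_stat-style line as an int.

    Returns None when the line does not start with a plain counter.
    """
    m = _COUNT.match(line)
    if m is None:
        return None
    return int(m.group(1)) * _SI[m.group(2)]
```

With this, a hypothetical line such as "12M Requested pages found in
the cache (99%)" would be graphed as 12000000 rather than skipped.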
Thanks for any pointers and ideas,
-- 
Bjørn