--On October 7, 2009 10:53:27 AM +0200 Peter Mogensen apm@mutex.dk wrote:
Hi,
I found the reason that slapd was hanging at startup. It turned out to be a schema which hadn't been properly replicated after being dynamically added. So now replication is actually moving entries. However, it seems to constantly lose the connection (which may be why the schema sometimes fails to replicate on load).
The setup is 2 mirrormode servers (slapd 2.4.17). Server 1 has the database and is trying to replicate it to Server 2, which was empty from the start.
I have syncrepl for both cn=config and the actual database, which means that I should see 4 connections (2 each way) between server 1 and 2. But the last connection (server2->server1) seems to open and close constantly.
On server 2 I see repeated:
Oct 7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying
Oct 7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
Oct 7 09:47:28 s02 slapd[26723]: do_syncrepl: rid=003 rc -1 retrying
Oct 7 09:48:49 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
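To see which rid is doing the retrying, and how often, retry messages like the ones above can be tallied per rid. This is only a minimal sketch in Python, assuming syslog lines in exactly the format shown; the names RETRY_RE, count_retries and SAMPLE are made up for illustration and are not part of OpenLDAP.

```python
import re
from collections import Counter

# Matches retry lines such as:
#   "... slapd[26723]: do_syncrepl: rid=001 rc -1 retrying"
# The "do_syncrep2 ... Can't contact LDAP server" lines are deliberately not matched.
RETRY_RE = re.compile(r"do_syncrepl: rid=(\d+) rc -?\d+ retrying")


def count_retries(lines):
    """Return a Counter mapping each rid (as a string) to its number of retry messages."""
    counts = Counter()
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The four log lines quoted in this message, used as sample input.
SAMPLE = [
    "Oct 7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying",
    "Oct 7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server",
    "Oct 7 09:47:28 s02 slapd[26723]: do_syncrepl: rid=003 rc -1 retrying",
    "Oct 7 09:48:49 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server",
]

for rid, n in sorted(count_retries(SAMPLE).items()):
    print(f"rid={rid}: {n} retry message(s)")
```

Feeding it a real log file instead of SAMPLE (for example, the lines of an opened syslog file) would show whether the retries are concentrated on one rid or spread over several.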
When adding olcLogLevel: conns sync trace none I see the log messages I would expect, mixed with a lot of these:
Oct 7 10:41:52 s02 slapd[26723]: slap_sl_malloc of 48 bytes failed, using ch_malloc
Oct 7 10:41:52 s02 slapd[26723]: slap_sl_malloc of 40 bytes failed, using ch_malloc
... coming in bursts with a varying number of bytes. However, the machine doesn't look like it's running out of mem.
Unable to malloc means your system is running out of memory. That's bad.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration