I have now installed openldap 2.4.19 from source (default configuration except for --enable-crypt=yes).
# slapd -V
@(#) $OpenLDAP: slapd 2.4.19 (Dec 9 2009 22:46:15) $
Unfortunately, the bug is still there; nothing has changed at all. slapd has crashed 10 times since I did the migration 10 hours ago. It usually crashes 1-2 times at night, and 10-20 times during work hours (when the servers are under more load).
When I migrated, I recreated the database by typing:
# oldbin/slapcat > db.ldif
# newbin/slapadd < db.ldif
# (under the new openldap/slapd user)
So I can't imagine that the database is corrupt in any way, which is what I suspected when I first sent this bug report.
It still dies on the same queries as before, in the middle of iterating through ou=users,.. or ou=groups,.., which both have ~1.8k entries. These queries come from proftpd, which does a "getent passwd" and "getent group" every time a customer logs in. I tried to reproduce this manually again by launching 8 processes that constantly query ou=users and ou=groups, 2 using a unix socket locally and 6 using ldaps://hostname/ remotely, but slapd still won't break down when I do that. This causes *a lot* more load than proftpd does, yet the crashes only seem to happen "randomly".
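For reference, the manual reproduction attempt looked roughly like the sketch below. This is not the exact script: the function names and the QUERY_CMD, WORKERS and ITERATIONS variables are made up for illustration, and the default query is the same "getent passwd" call that proftpd triggers. To hit a specific listener instead, QUERY_CMD could be set to an ldapsearch invocation against the local socket or the ldaps:// URI.

```shell
#!/bin/sh
# Hypothetical stress loop: WORKERS parallel workers each run QUERY_CMD
# ITERATIONS times. All names and defaults here are assumptions.
QUERY_CMD=${QUERY_CMD:-"getent passwd"}
WORKERS=${WORKERS:-8}
ITERATIONS=${ITERATIONS:-100}

# Run the query repeatedly, discarding its output and reporting failures.
run_worker() {
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        # QUERY_CMD is left unquoted on purpose so it splits into command + args.
        $QUERY_CMD >/dev/null 2>&1 || echo "worker $1: query failed on iteration $i"
        i=$((i + 1))
    done
}

# Start all workers in the background and wait for them to finish.
stress() {
    w=0
    while [ "$w" -lt "$WORKERS" ]; do
        run_worker "$w" &
        w=$((w + 1))
    done
    wait
}
```

Calling "stress" after sourcing the file starts the run; any "query failed" line gives a rough timestamp to compare against the slapd log.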
Any suggestions on what I can do to track down this problem? The LDAP server is live and in use on two servers with 2-3k users, so I can't mess with it *too* much.
On Fri, Oct 02, 2009 at 09:28:25PM +0000, quanah@zimbra.com wrote:
--On Friday, October 02, 2009 8:54 PM +0000 hyc@symas.com wrote:
strace is useless. Use gdb and get a trace of all running threads when this occurs.
You also need to test against a current release of OpenLDAP, like OpenLDAP 2.4.18.
--Quanah
--
Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
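
The gdb advice quoted above could be followed along these lines. This is only a sketch: the helper name gdb_bt_cmd and its arguments are hypothetical, and it merely prints the gdb command line so that it can be reviewed before being run on a live server. Since slapd dies rather than hangs, the core-file form is probably the relevant one, which assumes core dumps were enabled (for example with "ulimit -c unlimited") in the shell that starts slapd.

```shell
#!/bin/sh
# Sketch: build a gdb command that prints backtraces for all threads.
# The helper name and argument layout are assumptions for illustration.

# $1 = "pid" or "core"
# pid mode:  $2 = PID of the running slapd
# core mode: $2 = core file, $3 = path to the slapd binary that produced it
gdb_bt_cmd() {
    case "$1" in
        pid)  echo "gdb -p $2 -batch -ex 'thread apply all bt full'" ;;
        core) echo "gdb $3 $2 -batch -ex 'thread apply all bt full'" ;;
        *)    echo "usage: gdb_bt_cmd pid PID | core COREFILE BINARY" >&2
              return 1 ;;
    esac
}
```

For example, eval "$(gdb_bt_cmd core ./core /path/to/slapd)" > slapd-threads.txt would save the trace to a file. Attaching by PID pauses slapd while gdb collects the trace, so on a production server the core-file route is less disruptive.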