Hello list.
Since a recent upgrade from 2.4.12 to 2.4.13, slapd has been hanging repeatedly.
On the client side, ldapsearch requests fail with this error:

    error.c:272: ldap_parse_result: Assertion `r != ((void *)0)' failed

In this case I'd expect an automatic switch to the slave server, but that doesn't happen. Here is my LDAP library configuration:

    BASE dc=msr-inria,dc=inria,dc=fr
    URI ldap://ldap1.msr-inria.inria.fr ldap://ldap2.msr-inria.inria.fr
    TLS_CACERTDIR /etc/pki/tls/certs
    TLS_REQCERT demand
    NETWORK_TIMEOUT 2
    TIMEOUT 2
    TIMELIMIT 2

On the server side, slapd usually shows up using 100, 200 or 300% CPU, which makes me think that some specific repeated query triggers the issue, and that the problem gets worse as several of them accumulate.
strace on the running slapd process shows it waiting on a futex:

    [root@etoile main]# strace -p 2769
    Process 2769 attached - interrupt to quit
    futex(0xb6bb4bd8, FUTEX_WAIT, 2774, NULL <unfinished ...>

And gdb shows it waiting in __kernel_vsyscall:

    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xb7d385c6 in pthread_join () from /lib/i686/libpthread.so.0
    #2  0xb7f23d3f in ldap_pvt_thread_join () from /usr/lib/libldap_r-2.4.so.2
    #3  0x0806e1b4 in slapd_daemon ()
    #4  0x0805a507 in main ()
In both cases, I think the lack of relevant information is due to the multithreaded nature of slapd; I don't know how to get at the exact thread where the problem occurs.
I already tried regenerating the indexes, without result. I dropped the database and rebuilt it from the latest backup, which made the problem disappear temporarily. I didn't find anything in the logs, even with the debug level set to 'trace'.
I'm using a bdb backend, with this configuration in slapd.conf:

    database bdb
    suffix "dc=msr-inria,dc=inria,dc=fr"
    rootdn "cn=root,dc=msr-inria,dc=inria,dc=fr"
    #rootpw root
    directory /var/lib/ldap/main
    cachesize 1000
    idlcachesize 1000
    checkpoint 256 5

And this one in DB_CONFIG:

    set_cachesize 0 1048576 0
    set_lg_bsize 2097152
    set_lg_max 10485760
    set_flags DB_LOG_AUTOREMOVE

The full slapd.conf is accessible at http://pastebin.mandriva.com/5801
db_stat -m output is accessible at http://pastebin.mandriva.com/5799
The main database itself is quite small; the LDIF backup is 1.4 only. I also have a log database for syncrepl purposes.
I'm using OpenLDAP 2.4.13 with db 4.6.21 on Mandriva Linux 2008.1, a 32-bit system. I'd be happy to provide additional information if needed.
Guillaume Rousse wrote:
> And gdb shows it waiting in __kernel_vsyscall
> [...]
> In both cases, I think the lack of relevant information is due to the multithreaded nature of slapd; I don't know how to get at the exact thread where the problem occurs.
I finally found out how to get a better stack trace; it seems to be a locking issue with bdb: http://pastebin.mandriva.com/5804
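The message does not say how the fuller trace was obtained. One common way to see every slapd thread, rather than only the main thread stuck in pthread_join(), is gdb's "thread apply all bt". The following is only a minimal Python sketch that builds such a batch invocation; the helper names are made up for illustration, the PID is whatever slapd is running as, and attaching normally requires root and the debugging symbols to be installed.

```python
import subprocess
from typing import List


def gdb_all_threads_cmd(pid: int) -> List[str]:
    """Build a gdb batch command that prints a backtrace of every thread."""
    # -p attaches to the running process, -batch exits once the commands
    # have run, and -ex executes a single gdb command.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_all_threads(pid: int) -> str:
    """Run gdb against the process and return what it printed.

    Assumes gdb is installed and the caller may attach to the process.
    """
    result = subprocess.run(
        gdb_all_threads_cmd(pid), capture_output=True, text=True, check=False
    )
    return result.stdout
```

With the PID from the strace session above, dump_all_threads(2769) would return one backtrace per thread, which is where a lock wait inside the bdb code would become visible.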
--On Friday, January 23, 2009 2:50 PM +0100 Guillaume Rousse Guillaume.Rousse@inria.fr wrote:
> I finally found out how to get a better stack trace; it seems to be a locking issue with bdb: http://pastebin.mandriva.com/5804
Do you have all the BDB 4.6 patches applied?
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
> Do you have all the BDB 4.6 patches applied?
Only patch 1 of the 3 available at http://www.oracle.com/technology/products/berkeley-db/db/update/4.6.21/patch.... And patch #2 is indeed a good candidate for the current issue.
--On Friday, January 23, 2009 11:24 AM +0100 Guillaume Rousse Guillaume.Rousse@inria.fr wrote:
> Since a recent upgrade from 2.4.12 to 2.4.13, slapd has been hanging repeatedly.
(a) build with debugging symbols (CFLAGS=-g)
(b) did you patch 2.4.13 for ITS#5768 and ITS#5841?
--Quanah
Quanah Gibson-Mount wrote:
> (a) build with debugging symbols (CFLAGS=-g)
It is, actually; I just forgot to install the package providing the symbols. I'll do so before producing a new trace if the problem arises again, as things seem to have settled down a bit after I carefully tuned the nss configuration of my fleet of machines to reduce incoming queries.
> (b) did you patch 2.4.13 for ITS#5768 and ITS#5841?
The first one, not the second. According to the description of the problem it addresses, I don't think I'm affected (but I may be wrong).
Hello list,
How can I get the entries that have been modified in OpenLDAP?
Is it possible using ldapsearch?
Thanks,
Marcelo José Xavier
marcelo.xavier@caixa.gov.br wrote:
> How can I get the entries that have been modified in OpenLDAP?
> Is it possible using ldapsearch?
Have a look at the accesslog overlay or how to implement a syncrepl client. A very primitive approach is to use a filter with attributes createTimestamp and modifyTimestamp.
Ciao, Michael.
Michael Ströder wrote:
> Have a look at the accesslog overlay or how to implement a syncrepl client. A very primitive approach is to use a filter with attributes createTimestamp and modifyTimestamp.
For the latter, only modifyTimestamp is needed; it is set at creation time as well as for all modifications.
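As a rough illustration of the modifyTimestamp approach, here is a minimal Python sketch that builds the filter and a matching ldapsearch command line. The function names are invented for this example, and the server URI and base DN in the usage note are placeholders; -x, -H and -b are the usual ldapsearch options for simple authentication, server URI and search base. Note that ordering filters on modifyTimestamp depend on the server supporting ordering matches for that attribute, and that this does not report deleted entries.

```python
from datetime import datetime, timezone
from typing import List


def modified_since_filter(since: datetime) -> str:
    """Return an LDAP filter for entries created or modified at or after `since`."""
    # modifyTimestamp is a GeneralizedTime value, so express the instant in
    # UTC with a trailing "Z". A naive datetime is treated as local time.
    stamp = since.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")
    return f"(modifyTimestamp>={stamp})"


def ldapsearch_cmd(uri: str, base: str, since: datetime) -> List[str]:
    """Build an ldapsearch argument list using the filter above."""
    return ["ldapsearch", "-x", "-H", uri, "-b", base, modified_since_filter(since)]
```

For example, ldapsearch_cmd("ldap://ldap.example.com", "dc=example,dc=com", some_datetime) yields a command that can be passed to subprocess.run or pasted into a shell after quoting the filter.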
openldap-software@openldap.org