Hi
I upgraded two LDAP servers (master/slave) from 2.4.21 to 2.4.28 two days ago. Today the master crashed. This is what I get when I run an ldapsearch:
root@ldap-master001 /]#---> ldapsearch -ZZ -hlocalhost -d-1
ldap_create
ldap_url_parse_ext(ldap://localhost)
ldap_extended_operation_s
ldap_extended_operation
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP localhost:389
ldap_new_socket: 3
ldap_prepare_socket: 3
ldap_connect_to_host: Trying ::1 389
ldap_pvt_connect: fd: 3 tm: -1 async: 0
ldap_open_defconn: successful
ldap_send_server_request
ber_scanf fmt ({it) ber:
ber_dump: buf=0x13c3910 ptr=0x13c3910 end=0x13c392f len=31
  0000: 30 1d 02 01 01 77 18 80 16 31 2e 33 2e 36 2e 31 0....w...1.3.6.1
  0010: 2e 34 2e 31 2e 31 34 36 36 2e 32 30 30 33 37 .4.1.1466.20037
ber_scanf fmt ({) ber:
ber_dump: buf=0x13c3910 ptr=0x13c3915 end=0x13c392f len=26
  0000: 77 18 80 16 31 2e 33 2e 36 2e 31 2e 34 2e 31 2e w...1.3.6.1.4.1.
  0010: 31 34 36 36 2e 32 30 30 33 37 1466.20037
ber_flush2: 31 bytes to sd 3
  0000: 30 1d 02 01 01 77 18 80 16 31 2e 33 2e 36 2e 31 0....w...1.3.6.1
  0010: 2e 34 2e 31 2e 31 34 36 36 2e 32 30 30 33 37 .4.1.1466.20037
ldap_write: want=31, written=31
  0000: 30 1d 02 01 01 77 18 80 16 31 2e 33 2e 36 2e 31 0....w...1.3.6.1
  0010: 2e 34 2e 31 2e 31 34 36 36 2e 32 30 30 33 37 .4.1.1466.20037
ldap_result ld 0x13bb660 msgid 1
wait4msg ld 0x13bb660 msgid 1 (infinite timeout)
wait4msg continue ld 0x13bb660 msgid 1 all 1
** ld 0x13bb660 Connections:
* host: localhost port: 389 (default)
  refcnt: 2 status: Connected
  last used: Thu Jan 26 21:43:50 2012

** ld 0x13bb660 Outstanding Requests:
 * msgid 1, origid 1, status InProgress
   outstanding referrals 0, parent count 0
  ld 0x13bb660 request count 1 (abandoned 0)
** ld 0x13bb660 Response Queue:
   Empty
  ld 0x13bb660 response count 0
ldap_chkResponseList ld 0x13bb660 msgid 1 all 1
ldap_chkResponseList returns ld 0x13bb660 NULL
ldap_int_select
It hangs at ldap_int_select, and there is nothing in the slapd debug log. If I kill the daemon and start it again, the same problem comes back after a few seconds. (The daemon had crashed; only a kill -9 helped.)
And now, after one hour of debugging, the same problem has occurred on our first slave server....
Does anyone have similar problems, or can someone help me? The next thing I'll do is go back to 2.4.21; maybe that helps....
Thanks
--On Friday, January 27, 2012 12:07 AM +0100 Raffael Sahli public@raffaelsahli.com wrote:
> Hi
>
> I upgraded two LDAP servers (master/slave) from 2.4.21 to 2.4.28 two days ago. Today the master crashed. This is what I get when I run an ldapsearch:
>
> Does anyone have similar problems, or can someone help me? The next thing I'll do is go back to 2.4.21; maybe that helps....
Can you try the current code from RE24 instead? 2.4.29 is about to be released, so it would be good to know whether that fixes your issue.
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
On 01/27/2012 12:44 AM, Quanah Gibson-Mount wrote:
> Can you try the current code from RE24 instead? 2.4.29 is about to be released, so it would be good to know whether that fixes your issue.
>
> --Quanah
OK, it was probably not the upgrade; the problem is one particular LDAP database.
The slapd daemon is now working without problems, but I need this LDAP database ;). If I add the database again, the daemon crashes again after a few minutes (200-400 s).
The same setup works on a slave server (same ldap database, same ACL).
What could that be? The last slapd debug entries just show lines like:
daemon: activity on 1 descriptor
daemon: activity on: 25r
daemon: read active on 25
daemon: epoll: listen=7 active_threads=0 tvp=zero
daemon: epoll: listen=8 active_threads=0 tvp=zero
daemon: activity on 1 descriptor
daemon: activity on:
slap_listener_activate(7):
daemon: epoll: listen=7 busy
daemon: epoll: listen=8 active_threads=0 tvp=zero
daemon: activity on 1 descriptor
daemon: activity on: 30r
daemon: read active on 30
daemon: epoll: listen=7 busy
daemon: epoll: listen=8 active_threads=0 tvp=zero
daemon: activity on 1 descriptor
daemon: activity on: 39r
daemon: read active on 39
daemon: epoll: listen=7 busy
daemon: epoll: listen=8 active_threads=0 tvp=zero
process is
Could that be a strange ACL issue (a long lookup time or something like that; we have ACL rules pointing to the remote server)? But would that crash the daemon? And a slave works with the same ACL...
Thanks
--On Friday, February 03, 2012 9:25 AM +0100 Raffael Sahli public@raffaelsahli.com wrote:
> OK, it was probably not the upgrade; the problem is one particular LDAP database.
>
> The slapd daemon is now working without problems, but I need this LDAP database ;). If I add the database again, the daemon crashes again after a few minutes (200-400 s).
>
> The same setup works on a slave server (same ldap database, same ACL).
>
> What could that be? The last slapd debug entries just show lines like:
You need to enable core files, and get a gdb backtrace with a slapd binary that has debugging symbols. Log-level output isn't going to help any.
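As a rough sketch of what that could look like in practice (not a definitive procedure): the helper below wraps gdb so it can read either a core file or a still-running slapd, which may be useful here since the daemon appears to hang rather than exit. The function name slapd_backtrace, the SLAPD_BIN default of /usr/sbin/slapd, and the core file path in the usage comments are assumptions for illustration; only the gdb options (-batch, -ex, -p) and the "thread apply all bt full" command are standard gdb.

```shell
#!/bin/sh
# Sketch only. SLAPD_BIN is an assumed install path; point it at a slapd
# binary built with debugging symbols (not stripped).
SLAPD_BIN="${SLAPD_BIN:-/usr/sbin/slapd}"
GDB="${GDB:-gdb}"

# Before starting slapd, allow core files in the same shell:
#   ulimit -c unlimited

# Print backtraces of all threads.
# Argument: a core file path, or the PID of a running (hung) slapd.
slapd_backtrace() {
    target="$1"
    if [ -z "$target" ]; then
        echo "usage: slapd_backtrace <core-file|pid>" >&2
        return 2
    fi
    case "$target" in
        *[!0-9]*)
            # Contains a non-digit: treat it as a core file path.
            "$GDB" -batch -ex "thread apply all bt full" "$SLAPD_BIN" "$target"
            ;;
        *)
            # Digits only: attach to the live process.
            "$GDB" -batch -ex "thread apply all bt full" -p "$target"
            ;;
    esac
}

# Hypothetical usage:
#   slapd_backtrace /var/tmp/core.12345 > slapd-bt.txt
#   slapd_backtrace "$(pidof slapd)"    > slapd-bt.txt
```

Attaching to the live process with a PID avoids the kill -9, which would not leave a core file behind.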
--Quanah
openldap-technical@openldap.org