OpenLDAP 2.4.28 master/slave crash after upgrade? - openldap-technical

26 Jan 2012


Hi,
Two days ago I upgraded two LDAP servers (master/slave) from 2.4.21 to
2.4.28.
Today the master crashed. If I run an ldapsearch, I get:
root@ldap-master001 /]#---> ldapsearch -ZZ -hlocalhost -d-1
ldap_create
ldap_url_parse_ext(ldap://localhost)
ldap_extended_operation_s
ldap_extended_operation
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP localhost:389
ldap_new_socket: 3
ldap_prepare_socket: 3
ldap_connect_to_host: Trying ::1 389
ldap_pvt_connect: fd: 3 tm: -1 async: 0
ldap_open_defconn: successful
ldap_send_server_request
ber_scanf fmt ({it) ber:
ber_dump: buf=0x13c3910 ptr=0x13c3910 end=0x13c392f len=31
   0000:  30 1d 02 01 01 77 18 80  16 31 2e 33 2e 36 2e 31   0....w...1.3.6.1
   0010:  2e 34 2e 31 2e 31 34 36  36 2e 32 30 30 33 37      .4.1.1466.20037
ber_scanf fmt ({) ber:
ber_dump: buf=0x13c3910 ptr=0x13c3915 end=0x13c392f len=26
   0000:  77 18 80 16 31 2e 33 2e  36 2e 31 2e 34 2e 31 2e   w...1.3.6.1.4.1.
   0010:  31 34 36 36 2e 32 30 30  33 37                     1466.20037
ber_flush2: 31 bytes to sd 3
   0000:  30 1d 02 01 01 77 18 80  16 31 2e 33 2e 36 2e 31   0....w...1.3.6.1
   0010:  2e 34 2e 31 2e 31 34 36  36 2e 32 30 30 33 37      .4.1.1466.20037
ldap_write: want=31, written=31
   0000:  30 1d 02 01 01 77 18 80  16 31 2e 33 2e 36 2e 31   0....w...1.3.6.1
   0010:  2e 34 2e 31 2e 31 34 36  36 2e 32 30 30 33 37      .4.1.1466.20037
ldap_result ld 0x13bb660 msgid 1
wait4msg ld 0x13bb660 msgid 1 (infinite timeout)
wait4msg continue ld 0x13bb660 msgid 1 all 1
** ld 0x13bb660 Connections:
* host: localhost  port: 389  (default)
   refcnt: 2  status: Connected
   last used: Thu Jan 26 21:43:50 2012
** ld 0x13bb660 Outstanding Requests:
  * msgid 1,  origid 1, status InProgress
    outstanding referrals 0, parent count 0
   ld 0x13bb660 request count 1 (abandoned 0)
** ld 0x13bb660 Response Queue:
    Empty
   ld 0x13bb660 response count 0
ldap_chkResponseList ld 0x13bb660 msgid 1 all 1
ldap_chkResponseList returns ld 0x13bb660 NULL
ldap_int_select
It hangs at ldap_int_select. (There's nothing in the slapd debug log.)
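For reference, the 31 bytes written just before the hang can be decoded by hand. The sketch below is only an illustration, assuming short-form BER lengths (which is all this packet uses); the helper names read_tlv and decode_extended_request are made up for this example. It suggests the client sent an LDAP ExtendedRequest with message ID 1 and the StartTLS OID, which is what -ZZ requests, and is then waiting for the server's reply.

```python
# Bytes copied from the ber_flush2 dump in the log above.
PACKET = bytes.fromhex(
    "30 1d 02 01 01 77 18 80 16 31 2e 33 2e 36 2e 31 "
    "2e 34 2e 31 2e 31 34 36 36 2e 32 30 30 33 37"
)


def read_tlv(buf, pos=0):
    """Read one BER tag/length/value at pos; return (tag, value, next_pos).

    Only short-form lengths (< 128 bytes) are handled in this sketch.
    """
    tag = buf[pos]
    length = buf[pos + 1]
    if length & 0x80:
        raise ValueError("long-form BER length not handled in this sketch")
    start = pos + 2
    return tag, buf[start:start + length], start + length


def decode_extended_request(packet):
    """Return (message_id, request_oid) for a simple LDAP ExtendedRequest."""
    tag, message, _ = read_tlv(packet)
    if tag != 0x30:  # SEQUENCE: the LDAPMessage envelope
        raise ValueError("not an LDAPMessage")
    tag, msgid_bytes, pos = read_tlv(message)
    if tag != 0x02:  # INTEGER: messageID
        raise ValueError("missing messageID")
    tag, request, _ = read_tlv(message, pos)
    if tag != 0x77:  # [APPLICATION 23], constructed: ExtendedRequest
        raise ValueError("not an ExtendedRequest")
    tag, oid, _ = read_tlv(request)
    if tag != 0x80:  # [0], primitive: requestName
        raise ValueError("missing requestName")
    return int.from_bytes(msgid_bytes, "big"), oid.decode("ascii")


msgid, oid = decode_extended_request(PACKET)
print(msgid, oid)  # prints: 1 1.3.6.1.4.1.1466.20037
```

The message ID matches the "msgid 1 ... status InProgress" entry in the outstanding requests list, so the request went out and no response had come back.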
If I kill the daemon and start it again, the same problem comes back
after a few seconds.
(The daemon crashed; only a kill -9 helped.)
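Because the client waits with an infinite timeout, one way to notice the hang without sitting in front of a terminal is to run the same command under an external timeout. This is a minimal sketch, not a monitoring solution; the function name probe and the result strings are hypothetical, and the timeout is enforced by Python rather than by any ldapsearch option.

```python
import subprocess


def probe(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns "ok" on exit status 0, "failed" on any other exit status,
    and "hung" if the command has not exited within timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if result.returncode == 0 else "failed"
```

With the command from this post it could be called as probe(["ldapsearch", "-ZZ", "-hlocalhost"], 10.0); a "hung" result would correspond to the client sitting in ldap_int_select as shown above.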
And now, after an hour of debugging, the same problem has occurred on our
first slave server.
Has anyone seen similar problems, or can someone help me?
The next thing I'll do is go back to 2.4.21; maybe that helps.
Thanks
-- 
Raffael Sahli
public@raffaelsahli.com