We have a 3-way multi-master configuration running OpenLDAP 2.4.40 on CentOS 5.11. All three servers had been up for years until the other day.
The servers are ldapserver1, ldapserver2, and ldapserver3. slapd is still running on two of the three, but it stopped on ldapserver2 and won’t restart.
From the logs on ldapserver2:
May 10 04:02:13 gp42-admin4 slapd[4541]: slapd shutdown: waiting for 0 operations/tasks to finish
May 10 04:02:19 gp42-admin4 slapd[15633]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
May 10 04:02:19 gp42-admin4 slapd[15633]: nss-ldap: do_open: do_start_tls failed:stat=-1
May 10 04:02:19 gp42-admin4 slapd[15633]: nss_ldap: reconnected to LDAP server ldap://ldapserver1.example.com
May 10 04:02:21 gp42-admin4 slapd[15634]: bdb_db_open: database "cn=accesslog": database already in use.
May 10 04:02:21 gp42-admin4 slapd[15634]: backend_startup_one (type=bdb, suffix="cn=accesslog"): bi_db_open failed! (-1)
May 10 04:02:21 gp42-admin4 slapd[15634]: slapd stopped.
May 10 04:02:22 gp42-admin4 slapd[4541]: slapd stopped.
When attempting to restart slapd on ldapserver2:
May 13 10:13:54 gp42-admin4 slapd[12085]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
May 13 10:13:54 gp42-admin4 slapd[12085]: nss-ldap: do_open: do_start_tls failed:stat=-1
May 13 10:13:54 gp42-admin4 slapd[12085]: nss_ldap: reconnected to LDAP server ldap://ldapserver1.example.com
May 13 10:13:56 gp42-admin4 slapd[12086]: slapd starting
May 13 10:13:56 gp42-admin4 slapd[12086]: do_syncrep2: rid=002 (4096) Content Sync Refresh Required
May 13 10:13:56 gp42-admin4 slapd[12086]: do_syncrep2: rid=001 (4096) Content Sync Refresh Required
May 13 10:13:57 gp42-admin4 slapd[12086]: => bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
May 13 10:13:57 gp42-admin4 slapd[12086]: => bdb_dn2id_add 0xfc6: parent (cn=accesslog) insert failed: -30995
May 13 10:13:57 gp42-admin4 slapd[12086]: => bdb_idl_delete_key: c_del id failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
May 13 10:13:57 gp42-admin4 slapd[12086]: => bdb_dn2id_delete 0xf50: parent (cn=accesslog) delete failed: -30995
May 13 10:15:55 gp42-admin4 slapd[12106]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
May 13 10:15:55 gp42-admin4 slapd[12106]: nss-ldap: do_open: do_start_tls failed:stat=-1
May 13 10:15:55 gp42-admin4 slapd[12106]: nss_ldap: reconnected to LDAP server ldap://ldapserver1.example.com
May 13 10:15:55 gp42-admin4 slapd[12106]: bdb_db_open: database "dc=example,dc=ldap": unclean shutdown detected; attempting recovery.
May 13 10:15:57 gp42-admin4 slapd[12106]: bdb_db_open: database "cn=accesslog": unclean shutdown detected; attempting recovery.
May 13 10:15:58 gp42-admin4 slapd[12106]: slapd starting
May 13 10:28:49 gp42-admin4 slapd[12255]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
May 13 10:28:49 gp42-admin4 slapd[12255]: nss-ldap: do_open: do_start_tls failed:stat=-1
May 13 10:28:49 gp42-admin4 slapd[12255]: nss_ldap: reconnected to LDAP server ldap://ldapserver1.example.com
May 13 10:28:50 gp42-admin4 slapd[12255]: bdb_db_open: database "dc=example,dc=com": unclean shutdown detected; attempting recovery.
May 13 10:28:50 gp42-admin4 slapd[12255]: bdb_db_open: database "cn=accesslog": unclean shutdown detected; attempting recovery.
May 13 10:28:52 gp42-admin4 slapd[12255]: slapd starting
May 13 10:29:24 gp42-admin4 slapd[12264]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
May 13 10:29:24 gp42-admin4 slapd[12264]: nss-ldap: do_open: do_start_tls failed:stat=-1
May 13 10:29:24 gp42-admin4 slapd[12264]: nss_ldap: reconnected to LDAP server ldap://ldapserver1.example.com
May 13 10:29:24 gp42-admin4 slapd[12264]: bdb_db_open: database "dc=example,dc=ldap": unclean shutdown detected; attempting recovery.
May 13 10:29:24 gp42-admin4 slapd[12264]: bdb_db_open: database "cn=accesslog": unclean shutdown detected; attempting recovery.
May 13 10:29:24 gp42-admin4 slapd[12264]: slapd starting
May 13 10:29:53 gp42-admin4 slapd[12280]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
May 13 10:29:53 gp42-admin4 slapd[12280]: nss-ldap: do_open: do_start_tls failed:stat=-1
May 13 10:29:53 gp42-admin4 slapd[12280]: nss_ldap: reconnected to LDAP server ldap://ldapserver1.example.com
May 13 10:29:53 gp42-admin4 slapd[12280]: bdb_db_open: database "dc=example,dc=ldap": unclean shutdown detected; attempting recovery.
May 13 10:29:53 gp42-admin4 slapd[12280]: bdb_db_open: database "cn=accesslog": unclean shutdown detected; attempting recovery.
May 13 10:29:53 gp42-admin4 slapd[12280]: slapd starting
May 13 10:32:35 gp42-admin4 slapd[12345]: @(#) $OpenLDAP: slapd 2.4.40 (Sep 30 2014 16:49:45) $#012#011clement@localhost.localdomain:/home/clement/build/BUILD/openldap-2.4.40/servers/slapd
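Since the restart attempts above interleave several slapd PIDs, a small script can make them easier to compare. The following is only a minimal sketch: the function name failures_by_pid and the list of failure keywords are my own assumptions, not anything provided by OpenLDAP, and the regular expression assumes the syslog layout shown in the excerpts above.

```python
import re
from collections import defaultdict

# Matches syslog lines such as:
#   May 10 04:02:21 gp42-admin4 slapd[15634]: slapd stopped.
LINE_RE = re.compile(r"^(\w{3}\s+\d+ [\d:]{8}) (\S+) slapd\[(\d+)\]: (.*)$")

# Substrings treated as failure indicators (an assumption; adjust as needed).
FAILURE_MARKERS = (
    "failed",
    "DB_LOCK_DEADLOCK",
    "already in use",
    "unclean shutdown",
)


def failures_by_pid(lines):
    """Return {pid: [(timestamp, message), ...]} for lines that look like failures."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            # Skip anything that is not a slapd syslog line.
            continue
        timestamp, _host, pid, message = match.groups()
        if any(marker in message for marker in FAILURE_MARKERS):
            grouped[int(pid)].append((timestamp, message))
    return dict(grouped)
```

Passing it the lines of whichever file syslog writes slapd messages to (the location depends on the local syslog configuration) gives one list of suspicious messages per slapd process, which shows at a glance which start attempt hit which error.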
Attempting to restart slapd from the command line:
5735ed50 slapd starting
5735ed50 => bdb_entry_get: ndn: "cn=accesslog"
5735ed50 => bdb_entry_get: oc: "(null)", at: "(null)"
5735ed50 bdb_idl_fetch_key: %cn=accesslog
5735ed50 bdb_idl_fetch_key: [b49d1940]
5735ed50 bdb_idl_fetch_key:
5735ed50 send_ldap_result: err=0 matched="" text=""
5735ed50 => bdb_entry_get: ndn: "dc=example,dc=com"
5735ed50 => bdb_entry_get: oc: "(null)", at: "contextCSN"
ldap_build_search_req ATTRS: reqDN reqType reqMod reqNewRDN reqDeleteOldRDN reqNewSuperior entryCSN
ldap_build_search_req ATTRS: reqDN reqType reqMod reqNewRDN reqDeleteOldRDN reqNewSuperior entryCSN
=> ldap_bv2dn(uid=jdoe,ou=Users,dc=example,dc=com,0)
<= ldap_bv2dn(uid=jdoe,ou=Users,dc=example,dc=com)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(uid=jdoe,ou=Users,dc=example,dc=com)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(uid=jdoe,ou=Users,dc=example,dc=com)=0
=> ldap_bv2dn(uid=jdoe,ou=Users,dc=example,dc=com,0)
<= ldap_bv2dn(uid=jdoe,ou=Users,dc=example,dc=com)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(uid=jdoe,ou=Users,dc=example,dc=com)=0
=> ldap_bv2dn(uid=jdoe,ou=Users,dc=example,dc=com,0)
<= ldap_bv2dn(uid=jdoe,ou=Users,dc=example,dc=com)=0
=> ldap_dn2bv(272)
<= ldap_dn2bv(uid=jdoe,ou=Users,dc=example,dc=com)=0
5735ed50 => bdb_entry_get: ndn: "uid=jdoe,ou=Users,dc=example,dc=com"
5735ed50 => bdb_entry_get: oc: "(null)", at: "(null)"
slapd: search.c:1125: oc_filter: Assertion `f != ((void *)0)' failed.
Aborted
I have run db_recover on the database(s) on ldapserver2, but to no avail.
Does anyone have any suggestions?
Thank you in advance for any assistance.
John D. Borresen (Dave)
Linux/Unix Systems Administrator
MIT Lincoln Laboratory
Humanitarian Assistance and Disaster Relief (HADR) Systems
244 Wood St
Lexington, MA 02420
Email: john.borresen@ll.mit.edu