Hi,
We are running the standard openldap-2.2.13 and Berkeley DB 4.2.52 packages on a RHEL 4 server.
Every few weeks the LDAP service stops responding to queries and updates. The slapd process is still running, but it never answers requests.
The only way to resolve the issue is to stop ldap, run db_recover and then start it again. This is in our test environment (it has happened 4 times recently) but we are looking to go into production soon.
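For reference, the recovery sequence above looks roughly like the sketch below. It rests on assumptions not stated elsewhere in this post: that the init script is named "ldap", that the BDB environment is in /var/lib/ldap, and that the recovery tool is on the PATH as db_recover (some packages install it under a different name, such as slapd_db_recover). The run and recover_ldap helper names are just illustrative.

```shell
#!/bin/sh
# Sketch of the manual recovery sequence: stop slapd, recover the
# Berkeley DB environment, start slapd again.
# Assumed values (adjust for your system): service name "ldap",
# database directory /var/lib/ldap, tool name db_recover.

DB_DIR="${DB_DIR:-/var/lib/ldap}"
DB_RECOVER="${DB_RECOVER:-db_recover}"

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# recover_ldap: abort at the first failing step so slapd is never
# restarted on top of an environment that failed recovery.
recover_ldap() {
    run service ldap stop || return 1
    run "$DB_RECOVER" -v -h "$DB_DIR" || return 1
    run service ldap start
}
```

Calling `DRY_RUN=1 recover_ldap` only prints the three commands, which is a safe way to check the paths before running it for real.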
Has anyone experienced similar issues on RHEL 4, or does anyone have any idea how to prevent this from occurring?
There was nothing of note in the logs before the service stopped responding (loglevel 256). I restarted the service (without running db_recover) with loglevel -1. It was still unresponsive, but I noticed this in the logs:
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: activity on 1 descriptors
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: new connection on 9
Jan 15 12:01:51 linuxtest3 slapd[18755]: conn=0 fd=9 ACCEPT from IP=136.186.226.57:43999 (IP=0.0.0.0:389)
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: added 9r
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: activity on:
Jan 15 12:01:51 linuxtest3 slapd[18755]:
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: select: listen=6 active_threads=0 tvp=NULL
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: activity on 1 descriptors
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: activity on:
Jan 15 12:01:51 linuxtest3 slapd[18755]: 9r
Jan 15 12:01:51 linuxtest3 slapd[18755]:
Jan 15 12:01:51 linuxtest3 slapd[18755]: daemon: read activity on 9
Jan 15 12:01:51 linuxtest3 slapd[18755]: connection_get(9)
Jan 15 12:01:51 linuxtest3 slapd[18755]: connection_get(9): got connid=0
Jan 15 12:01:52 linuxtest3 slapd[18755]: connection_read(9): checking for input on id=0
Jan 15 12:01:52 linuxtest3 slapd[18755]: ber_get_next on fd 9 failed errno=11 (Resource temporarily unavailable)
Jan 15 12:01:52 linuxtest3 slapd[18755]: do_bind
Jan 15 12:01:52 linuxtest3 slapd[18755]: daemon: select: listen=6 active_threads=0 tvp=NULL
Jan 15 12:01:52 linuxtest3 slapd[18755]: >>> dnPrettyNormal: <cn=bob,dc=swin,dc=edu,dc=au>
Jan 15 12:01:52 linuxtest3 slapd[18755]: <<< dnPrettyNormal: <cn=bob,dc=swin,dc=edu,dc=au>, <cn=bob,dc=swin,dc=edu,dc=au>
Jan 15 12:01:52 linuxtest3 slapd[18755]: do_bind: version=3 dn="cn=bob,dc=swin,dc=edu,dc=au" method=128
Jan 15 12:01:52 linuxtest3 slapd[18755]: conn=0 op=0 BIND dn="cn=bob,dc=swin,dc=edu,dc=au" method=128
Jan 15 12:01:52 linuxtest3 slapd[18755]: ==> bdb_bind: dn: cn=bob,dc=swin,dc=edu,dc=au
Jan 15 12:01:52 linuxtest3 slapd[18755]: bdb_dn2entry("cn=bob,dc=swin,dc=edu,dc=au")
Jan 15 12:01:52 linuxtest3 slapd[18755]: => bdb_dn2id( "dc=swin,dc=edu,dc=au" )
Jan 15 12:01:52 linuxtest3 slapd[18755]: <= bdb_dn2id: got id=0x00000006
Jan 15 12:01:52 linuxtest3 slapd[18755]: => bdb_dn2id( "cn=bob,dc=swin,dc=edu,dc=au" )
Jan 15 12:01:52 linuxtest3 slapd[18755]: <= bdb_dn2id: get failed: DB_NOTFOUND: No matching key/data pair found (-3099 0)
thanks,
Daniel