I have two completely separate OpenLDAP 2.4 installations, both live. One is at work (roark) and the other is at home (missioncontrol).
On roark, slapcat fails with the errors below. Why is that?
[root@roark ~]# slapcat -v -l /root/backup.ldif -b "dc=mdah,dc=state,dc=ms,dc=us"
bdb_db_open: database "dc=mdah,dc=state,dc=ms,dc=us": unclean shutdown detected; attempting recovery.
bdb_db_open: database "dc=mdah,dc=state,dc=ms,dc=us": recovery skipped in read-only mode. Run manual recovery if errors are encountered.
bdb_db_open: database "dc=mdah,dc=state,dc=ms,dc=us": alock_recover failed
bdb_db_close: database "dc=mdah,dc=state,dc=ms,dc=us": alock_close failed
backend_startup_one: bi_db_open failed! (-1)
slap_startup failed
however, it runs fine on missioncontrol:
[root@missioncontrol ~]# slapcat -v -l /root/backup.ldif -b "dc=squeezer,dc=net"
bdb_monitor_db_open: monitoring disabled; configure monitor database to enable
# id=00000001
# id=00000002
# id=00000003
# id=00000004
# id=00000005
# id=00000006
# id=00000007
# id=00000008
# id=00000009
# id=0000000a
# id=0000000b
# id=0000000c
# id=0000000d
# id=0000000e
# id=0000000f
# id=00000010
# id=00000011
# id=00000012
# id=00000013
# id=00000014
# id=00000015
# id=00000016
# id=00000017
# id=00000018
# id=00000019
# id=0000001a
# id=0000001b
# id=0000001c
# id=0000001d
# id=0000001e
# id=0000001f
# id=00000020
# id=00000021
# id=00000022
# id=00000023
# id=00000024
# id=00000025
# id=00000026
# id=00000027
# id=00000028
Both systems have the same configuration apart from the suffix: one is dc=squeezer,dc=net and the other is dc=mdah,dc=state,dc=ms,dc=us. Other than that, their slapd.conf files are identical. On roark, I can get a slapcat-like dump with:
ldapsearch -v -x -h roark.mdah.state.ms.us -D "cn=Manager,dc=mdah,dc=state,dc=ms,dc=us" -w xxxxxxxxx + "*"
but I'd also like to have a slapcat dump as a secondary backup.
On Mon, 22 Jun 2009, Adam Williams wrote:
> On the one on roark when I run slapcat it errors, why is that?:
> [root@roark ~]# slapcat -v -l /root/backup.ldif -b "dc=mdah,dc=state,dc=ms,dc=us"
> bdb_db_open: database "dc=mdah,dc=state,dc=ms,dc=us": unclean shutdown detected; attempting recovery.
> bdb_db_open: database "dc=mdah,dc=state,dc=ms,dc=us": recovery skipped in read-only mode. Run manual recovery if errors are encountered.
> bdb_db_open: database "dc=mdah,dc=state,dc=ms,dc=us": alock_recover failed
> bdb_db_close: database "dc=mdah,dc=state,dc=ms,dc=us": alock_close failed
> backend_startup_one: bi_db_open failed! (-1)
> slap_startup failed
I'm not sure if this is *the* problem for your situation, but it can certainly be *a* problem: if you run slapd as a non-root user or with the -u option to change its user id, then you should be running slapcat as that same user.
Why? Because all the programs that open a Sleepycat/Berkeley DB environment should be run as the same user. Otherwise, a transaction log file may be created by the wrong user, making it inaccessible to the other user, which will cause a database panic. Yes, even a (read-only) slapcat process will create transaction log records. It only happens if the transaction log is close to rolling over to the next file, which makes the window small, but I saw it happen multiple times with a different project using BDB, so I know lightning can strike repeatedly.
If this is what happened then slapd will have died and you'll need to manually chown the transaction log files to the correct user.
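A minimal sketch of how that ownership check could be scripted. The directory /var/lib/ldap and the account name ldap are assumptions (Fedora-style defaults), the function names are made up for illustration, and the stat call assumes GNU coreutils:

```shell
#!/bin/sh
# Sketch only: DB_DIR and DB_USER are assumed defaults; adjust them to match
# the "directory" setting in slapd.conf and the account slapd runs as.
DB_DIR="${DB_DIR:-/var/lib/ldap}"
DB_USER="${DB_USER:-ldap}"

# Print every regular file directly inside directory $1 that is NOT owned by
# user $2 (a user name or numeric uid). No output means ownership is consistent.
list_foreign_files() {
    find "$1" -maxdepth 1 -type f ! -user "$2" -print
}

# Print "owner path" for each BDB transaction log file (log.*) in directory $1.
show_log_owners() {
    find "$1" -maxdepth 1 -type f -name 'log.*' -exec stat -c '%U %n' {} +
}
```

Running list_foreign_files "$DB_DIR" "$DB_USER" as root should print nothing on a healthy environment. Any file it does print is a candidate for chown. To avoid creating such files in the first place, slapcat can be run under the slapd account, for example with sudo -u ldap slapcat ..., writing the LDIF to a location that account can write to.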
The other thought is that the alock subsystem mentioned in the error messages depends on being able to hold kernel locks (fcntl() or lockf()) on a file in the BDB environment directory. If the filesystem where that directory is located doesn't support file locks (NFS?) or the system has a hard limit on the number of locks allocated, then this may fail. (But I would expect you to see those failures during slapd startup too...)
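A rough sketch of how those two conditions might be inspected on a Linux host. It assumes GNU stat and the /proc/PID/limits file, and the function names are hypothetical:

```shell
#!/bin/sh
# Sketch only: relies on Linux /proc and GNU coreutils stat.

# Print the type of the filesystem holding path $1 (for example an ext or nfs type).
fs_type() {
    stat -f -c '%T' "$1"
}

# Print the soft and hard "Max file locks" limits of the running process with pid $1.
lock_limits_for_pid() {
    awk '/^Max file locks/ { print "soft=" $4, "hard=" $5 }' "/proc/$1/limits"
}
```

For example, fs_type /var/lib/ldap (an assumed database directory) shows whether the BDB environment sits on a local or network filesystem, and lock_limits_for_pid with the pid of the running slapd shows the lock limits that process actually inherited.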
Philip Guenther
Philip Guenther wrote:
> I'm not sure if this is *the* problem for your situation, but it can certainly be *a* problem: if you run slapd as a non-root user or with the -u option to change its user id, then you should be running slapcat as that same user.
> Why? Because all the programs that open a Sleepycat/Berkeley DB environment should be run as the same user. Otherwise, a transaction log file may be created by the wrong user, making it inaccessible to the other user, which will cause a database panic. Yes, even a (read-only) slapcat process will create transaction log records. It only happens if the transaction log is close to rolling over to the next file, which makes the window small, but I saw it happen multiple times with a different project using BDB, so I know lightning can strike repeatedly.
> If this is what happened then slapd will have died and you'll need to manually chown the transaction log files to the correct user.
> The other thought is that the alock subsystem mentioned in the error messages depends on being able to hold kernel locks (fcntl() or lockf()) on a file in the BDB environment directory. If the filesystem where that directory is located doesn't support file locks (NFS?) or the system has a hard limit on the number of locks allocated, then this may fail. (But I would expect you to see those failures during slapd startup too...)
> Philip Guenther
slapd is running as the user ldap. The ldap account is disabled anyway; its shell is set to /bin/false. It's just an account that Fedora uses to give ldap.ldap ownership of /var/lib/ldap. slapd hasn't died, however:
[root@roark ~]# /etc/rc.d/init.d/ldap status
slapd (pid 26873) is running...
[root@roark ~]# ps axuw|grep slapd
ldap 26873 0.1 0.4 723628 17320 ? Ssl May07 130:45 /usr/sbin/slapd -h ldap:/// -u ldap
The filesystem on both servers is plain ext3 with nothing special (no LVM, TrueCrypt, NFS, etc.): just /dev/sda3 mounted as /, /dev/sda2 as /boot, and /dev/sda1 as swap. I'm not sure how to determine the hard lock limit, but it's whatever Fedora's default is, which should be enough. I'm not running into any other problems on the server, and it also runs named, postfix, samba, http, dovecot, etc.
I could restart slapd, but I'm worried that it wouldn't start up properly. That isn't a big deal, since I have the ldapsearch backup and it's trivial to restore from it, but I'd like to fix this problem if possible rather than restore from the ldapsearch backup.
--On Monday, June 22, 2009 4:31 PM -0500 Adam Williams awilliam@mdah.state.ms.us wrote:
> I have two totally separate openldap 2.4 installations, both are live. One is at work (roark) and the other is at home (missioncontrol).
Are they the same OpenLDAP 2.4 release?
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
--On Monday, June 22, 2009 9:21 PM -0500 Adam Williams awilliam@mdah.state.ms.us wrote:
> Quanah Gibson-Mount wrote:
>> Are they the same OpenLDAP 2.4 release?
> no, roark is openldap 2.4.12, while missioncontrol is 2.4.15.
That's probably why you are seeing errors. I recall this as being a bug in earlier 2.4 releases. Upgrade.
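As an illustration of comparing the two releases mentioned above, here is a small sketch. The helper name version_lt is hypothetical, and it assumes a sort that supports -V (GNU coreutils):

```shell
#!/bin/sh
# Sketch only: version_lt is an illustrative helper, not part of OpenLDAP.

# Succeed (exit 0) if version string $1 is strictly older than version string $2.
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

With the versions from this thread, version_lt 2.4.12 2.4.15 succeeds, confirming that roark is the older installation. The version strings themselves could be gathered on each host with slapd -VV or, on Fedora, rpm -q openldap-servers.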
--Quanah
openldap-software@openldap.org