--On Tuesday, August 16, 2011 3:51 PM -0700 David Engeset davidke@whidbey.net wrote:
Below is how I generally configure BDB and OpenLDAP for normal
operational use. I configured BDB with no parameters for all versions from 4.2 to 5.1, so I ran:
../dist/configure
make && make install
For building BDB on *nix systems, you should specify:
--enable-posixmutexes --with-mutex=POSIX/pthreads
For the debugging I did the following for BDB:
env CFLAGS=-O2 ../dist/configure --enable-debug
You need to use "-g" as I stated previously.
CFLAGS="-g -O2 -fPIC" are the flags I pass to BDB. What we are looking for is the gcc debugging symbols. The --enable-debug flag looks like it does some internal debugging stuff in the BDB code which isn't really what we're asking for.
make && make install
For OpenLDAP:
export LD_LIBRARY_PATH="/usr/local/BerkeleyDB.4.8/lib"
env CPPFLAGS="-I/usr/local/BerkeleyDB.4.8/include" CFLAGS="-g -O0" \
    LDFLAGS="-L/usr/local/BerkeleyDB.4.8/lib" ./configure --enable-wrappers \
    --enable-crypt --with-cyrus-sasl --with-tls --enable-debug
make depend && make && make install STRIP=''
This looks good.
Can you please re-generate your stack trace with a correctly built BDB behind OpenLDAP?
Thanks!
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
On 8/16/11 4:32 PM, Quanah Gibson-Mount wrote:
Can you please re-generate your stack trace with a correctly built BDB behind OpenLDAP?
Quanah,
Attached are the two new debug output files and below is what I used to compile BDB for debug.
env CFPLAGS="-g -O2 -fPIC" ../dist/configure --enable-posixmutexes --with-mutex=POSIX/pthreads
make && make install
Thank you,
David Engeset wrote:
Quanah, Attached are the two new debug output files and below is what I used to compile BDB for debug.
env CFPLAGS="-g -O2 -fPIC" ../dist/configure --enable-posixmutexes --with-mutex=POSIX/pthreads
make && make install
Here's the relevant info:
This locker is holding a write lock that the other threads are blocked waiting on:
8000542d dd= 6 locks held 1 write locks 1 pid/thread 1502/2366790512
8000542d WRITE 1 HELD dn2id.bdb page 4

This thread is using both a reader transaction and a write transaction (which is perfectly fine):
80005666 dd= 0 locks held 2 write locks 0 pid/thread 1502/2483026800
80005666 READ 1 HELD 0x2c1bc len: 5 data: 0x0300000000
80005666 READ 1 HELD 0x29f4c len: 5 data: 0x020x52000000
80005667 dd= 0 locks held 5 write locks 3 pid/thread 1502/2483026800
80005667 READ 1 WAIT dn2id.bdb page 4
80005667 WRITE 2 HELD dn2id.bdb page 2
80005667 READ 2 HELD dn2id.bdb page 2
80005667 WRITE 2 HELD dn2id.bdb page 1300
80005667 READ 1 HELD dn2id.bdb page 1300
80005667 WRITE 4 HELD dn2id.bdb page 1301

This thread is only waiting to read the locked page:
80004e6f dd=19 locks held 0 write locks 0 pid/thread 1502/2437909360
80004e6f READ 1 WAIT dn2id.bdb page 4

Thread 2366790512 is 0x8d125b70
       2483026800 is 0x93fffb70
       2437909360 is 0x914f8b70
In your gdb output we see that 0x914f8b70 is LWP 2306, waiting in a search. Thread 0x93fffb70 is LWP 1506, waiting in a delete. Thread 0x8d125b70 is LWP 2457, completely idle.
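The decimal-to-hex mapping used above to match lock-table thread IDs against gdb's thread list can be reproduced with printf. This is only a minimal sketch; the function name tid_to_hex is made up for illustration, and it assumes a shell whose printf handles values above 2^31 (bash does).

```shell
# Convert a decimal thread ID from the lock table into the hex form gdb prints.
# tid_to_hex is an illustrative helper name, not part of any tool.
tid_to_hex() {
    printf '0x%x\n' "$1"
}

# The three thread IDs quoted in the lock table above.
for tid in 2366790512 2483026800 2437909360; do
    printf '%s is %s\n' "$tid" "$(tid_to_hex "$tid")"
done
```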
I.e., the thread that owns the offending write lock is not executing any operation at all. It's important to note that back-bdb only takes write locks inside of a transaction, and transactions either commit or abort. In either case, all of their locks are supposed to be released at the end. It is impossible for slapd code to leak write locks like this. (In fact, the slapd code itself can never lock an individual DB page, as is being done here. Those locks can only be taken by the actual BDB library code.) As such, it appears to be a bug in the BDB library you're using.
Are you sure that OpenLDAP was built against the BDB library you've built, as opposed to some version that was already installed on your system?
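The way the lock table was read above (find the object that lockers are waiting on, then find who holds the write lock on it) could be automated roughly as follows. This is a sketch under assumptions: the field layout (locker, mode, count, status, object) is inferred from the excerpt in this message, and find_blockers is a hypothetical helper name. Real lock dumps may need different field handling.

```shell
# Sketch: read lock-table lines of the assumed form
#   <locker> <READ|WRITE> <count> <HELD|WAIT> <object...>
# and report, for each object that some locker is waiting on, which locker
# holds a WRITE lock on it. Locker summary lines ("dd= ...") are skipped
# because their second field is neither READ nor WRITE.
find_blockers() {
    awk '
        $2 == "READ" || $2 == "WRITE" {
            obj = $5
            for (i = 6; i <= NF; i++) obj = obj " " $i
            if ($4 == "WAIT") waiting[obj] = waiting[obj] " " $1
            if ($4 == "HELD" && $2 == "WRITE") holder[obj] = $1
        }
        END {
            for (obj in waiting)
                if (obj in holder)
                    printf "%s: held by %s, waited on by%s\n", obj, holder[obj], waiting[obj]
        }
    '
}

# A few of the lines from the excerpt above.
find_blockers <<'EOF'
8000542d dd= 6 locks held 1 write locks 1 pid/thread 1502/2366790512
8000542d WRITE 1 HELD dn2id.bdb page 4
80005667 READ 1 WAIT dn2id.bdb page 4
80005667 WRITE 2 HELD dn2id.bdb page 2
80004e6f READ 1 WAIT dn2id.bdb page 4
EOF
```

On this input the only contended object is dn2id.bdb page 4, with locker 8000542d as the holder, which matches the reading given above.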
On 8/18/11 3:11 PM, Howard Chu wrote:
Are you sure that OpenLDAP was built against the BDB library you've built, as opposed to some version that was already installed on your system?
Howard,
I believe I am compiling against the proper version of BDB. Here is what I did to compile OpenLDAP with debugging:
export LD_LIBRARY_PATH="/usr/local/BerkeleyDB.4.8/lib"
env CPPFLAGS="-I/usr/local/BerkeleyDB.4.8/include" CFLAGS="-g -O0" \
    LDFLAGS="-L/usr/local/BerkeleyDB.4.8/lib" ./configure --enable-wrappers \
    --enable-crypt --with-cyrus-sasl --with-tls --enable-debug
make depend && make && make install STRIP=''
I even have ld.so.conf set with the path to /usr/local/BerkeleyDB.4.8/lib. Is there something else that I am missing that will ensure it compiles against the version that I have installed?
Thank you,
--On Friday, August 19, 2011 8:07 AM -0700 David Engeset davidke@whidbey.net wrote:
I even have ld.so.conf set with the path to /usr/local/BerkeleyDB.4.8/lib. Is there something else that I am missing that will ensure it compiles against the version that I have installed?
Thank you,
If you use static modules, ldd slapd. If you use dynamic modules, ldd back_bdb.so or ldd back_hdb.so
--Quanah
On 8/19/11 10:19 AM, Quanah Gibson-Mount wrote:
If you use static modules, ldd slapd. If you use dynamic modules, ldd back_bdb.so or ldd back_hdb.so
Here are the results of running ldd on slapd (I use static modules):
ldd /usr/local/libexec/slapd
    linux-gate.so.1 => (0x00efe000)
    libuuid.so.1 => /lib/libuuid.so.1 (0x48526000)
    libdb-4.8.so => /usr/local/BerkeleyDB.4.8/lib/libdb-4.8.so (0x00110000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x477a1000)
    libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x48f0a000)
    libssl.so.10 => /usr/lib/libssl.so.10 (0x48ea9000)
    libcrypto.so.10 => /lib/libcrypto.so.10 (0x48b6a000)
    libcrypt.so.1 => /lib/libcrypt.so.1 (0x48341000)
    libresolv.so.2 => /lib/libresolv.so.2 (0x4797a000)
    libwrap.so.0 => /lib/libwrap.so.0 (0x46d5d000)
    libc.so.6 => /lib/libc.so.6 (0x475fe000)
    /lib/ld-linux.so.2 (0x475da000)
    libdl.so.2 => /lib/libdl.so.2 (0x477be000)
    libgssapi_krb5.so.2 => /lib/libgssapi_krb5.so.2 (0x48dd5000)
    libkrb5.so.3 => /lib/libkrb5.so.3 (0x48cf9000)
    libcom_err.so.2 => /lib/libcom_err.so.2 (0x48520000)
    libk5crypto.so.3 => /lib/libk5crypto.so.3 (0x48e18000)
    libz.so.1 => /lib/libz.so.1 (0x477fd000)
    libfreebl3.so => /lib/libfreebl3.so (0x47f97000)
    libnsl.so.1 => /lib/libnsl.so.1 (0x4914f000)
    libkrb5support.so.0 => /lib/libkrb5support.so.0 (0x48e45000)
    libkeyutils.so.1 => /lib/libkeyutils.so.1 (0x48cf4000)
    libselinux.so.1 => /lib/libselinux.so.1 (0x47834000)
So based upon this, if I am reading it correctly, slapd is linking to the BDB 4.8 library that I built from source, which installs to /usr/local/BerkeleyDB.4.8/.
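To pick the BDB line out of a long ldd listing like this one, a small filter can help. The following is a minimal sketch: libdb_path is an illustrative helper name, and it assumes the usual "name => path (address)" layout seen in the listing. Note that ldd reports what the dynamic linker resolves in the current environment (LD_LIBRARY_PATH, ld.so.conf), so it should be run in the same environment slapd runs in.

```shell
# Sketch: print the path that libdb resolves to, reading ldd output on stdin.
# libdb_path is an illustrative name; the "name => path (addr)" layout is
# assumed from the listing in this message.
libdb_path() {
    awk '$1 ~ /^libdb-/ && $2 == "=>" { print $3 }'
}

# Typical use (binary path taken from the message above):
#   ldd /usr/local/libexec/slapd | libdb_path
# Demonstration on two lines copied from the listing:
printf '%s\n' \
    'libuuid.so.1 => /lib/libuuid.so.1 (0x48526000)' \
    'libdb-4.8.so => /usr/local/BerkeleyDB.4.8/lib/libdb-4.8.so (0x00110000)' |
    libdb_path
```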
openldap-technical@openldap.org