Full_Name: Soichi Hayashi
Version: 2.4.22
OS: RHEL 5.7
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (99.137.202.79)
When a large number of LDAP update requests collide with a large number of LDAP search queries, slapd deadlocks on a mutex in the BDB library linked into OpenLDAP.
The following is the gdb output captured when the deadlock occurs.
(gdb) info threads
  13 Thread 0x40a73940 (LWP 18767) 0x00000037aa4d48a8 in epoll_wait () from /lib64/libc.so.6
  12 Thread 0x41cbe940 (LWP 18768) 0x00000037aa4cd722 in select () from /lib64/libc.so.6
  11 Thread 0x424bf940 (LWP 18769) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10 Thread 0x42cc0940 (LWP 19203) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9 Thread 0x41274940 (LWP 19212) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8 Thread 0x434c1940 (LWP 19213) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 0x43cc2940 (LWP 19214) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 0x444c3940 (LWP 19215) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 0x44cc4940 (LWP 19216) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 0x454c5940 (LWP 19217) 0x00000037aac0a256 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  3 Thread 0x45cc6940 (LWP 19218) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x464c7940 (LWP 19219) 0x00000037aac0a4c0 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
* 1 Thread 0x2ad9964071e0 (LWP 18752) 0x00000037aac07b35 in pthread_join () from /lib64/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (Thread 0x464c7940 (LWP 19219))]#0 0x00000037aac0a4c0 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00000037aac0a4c0 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1 0x0000003f22c2b963 in __db_pthread_mutex_lock_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#2 0x0000003f22d1e2dc in __memp_fget_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#3 0x0000003f22c46a8f in __bam_search_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#4 0x0000003f22c34649 in ?? () from /usr/lib64/libslapd24_db-4.8.so
#5 0x0000003f22c3534b in ?? () from /usr/lib64/libslapd24_db-4.8.so
#6 0x0000003f22cd4bbf in __dbc_iput_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#7 0x0000003f22cd536f in __dbc_put_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#8 0x0000003f22cdca0f in __dbc_put_pp_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#9 0x00000000004c9d7e in bdb_idl_insert_key ()
#10 0x00000000004cbcb2 in bdb_key_change ()
(gdb) thread 4
[Switching to thread 4 (Thread 0x454c5940 (LWP 19217))]#0 0x00000037aac0a256 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00000037aac0a256 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1 0x0000003f22c2b754 in __db_pthread_mutex_readlock_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#2 0x0000003f22d1e5fc in __memp_fget_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#3 0x0000003f22cd3df9 in __dbc_iget_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#4 0x0000003f22cdd43a in __dbc_get_pp_openldap_slapd24_mdv () from /usr/lib64/libslapd24_db-4.8.so
#5 0x00000000004ca6d0 in bdb_idl_fetch_key ()
#6 0x00000000004cbd75 in bdb_key_read ()
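In the above case, thread 2 (waiting in pthread_rwlock_wrlock under bdb_key_change) and thread 4 (waiting in pthread_rwlock_rdlock under bdb_key_read) are deadlocked. The deadlock occurs within 2 to 8 hours after we start testing the LDAP server on our BDII services.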
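For anyone trying to reproduce this, here is a rough sketch of how the threads stuck in rwlock calls can be picked out of saved `info threads` output. The function name `rwlock_waiters` and the `SAMPLE` text are illustrative only (they are not part of gdb or OpenLDAP), and the regular expression assumes the older gdb line format shown above.

```python
import re

# Matches one entry of gdb "info threads" output in the format shown above, e.g.
#   4 Thread 0x454c5940 (LWP 19217) 0x00000037aac0a256 in pthread_rwlock_rdlock () from ...
# Assumption: older gdb layout; newer gdb versions print different columns.
THREAD_RE = re.compile(
    r"^\*?\s*(\d+)\s+Thread\s+0x[0-9a-f]+\s+\(LWP\s+(\d+)\)\s+0x[0-9a-f]+\s+in\s+(\S+)\s+\("
)

def rwlock_waiters(info_threads_text):
    """Return (thread number, LWP, function) for threads sitting in pthread_rwlock_* calls."""
    waiters = []
    for line in info_threads_text.splitlines():
        m = THREAD_RE.match(line.strip())
        if m and m.group(3).startswith("pthread_rwlock_"):
            waiters.append((int(m.group(1)), int(m.group(2)), m.group(3)))
    return waiters

# Hypothetical sample: a few lines copied from the session in this report.
SAMPLE = """\
  4 Thread 0x454c5940 (LWP 19217) 0x00000037aac0a256 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  3 Thread 0x45cc6940 (LWP 19218) 0x00000037aac0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x464c7940 (LWP 19219) 0x00000037aac0a4c0 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
* 1 Thread 0x2ad9964071e0 (LWP 18752) 0x00000037aac07b35 in pthread_join () from /lib64/libpthread.so.0
"""

if __name__ == "__main__":
    for num, lwp, func in rwlock_waiters(SAMPLE):
        print(num, lwp, func)
```

This only lists which threads are blocked in rwlock calls; it does not show which thread holds the lock, so the backtraces are still needed to confirm a deadlock.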
Meanwhile, OpenLDAP 2.3 does not hit this deadlock, but its CPU usage increases gradually until performance degrades to the point where we need to restart the server. Is there any way you could fix the deadlock issue in 2.4, or the CPU usage issue in 2.3?
Thanks,
Soichi Hayashi