ckowalczyk@suse.com wrote:
Full_Name: Chris Kowalczyk Version: 2.4.41 OS: SLE12 URL: ftp://ftp.openldap.org/incoming/ Submission from: (NULL) (2620:113:80c0:5::2222)
Hello, I have been experiencing segfaults in mdb_env_reader_dest happening randomly, only from time to time, during shutting off openldap.
After some investigation I narrowed the problem down to mdb_env_reader_dest destructor function being called after its object had been deleted while powering off the application. mdb_env_reader_dest was called automatically, because it was agginned to the object in pthread_key_create call: https://github.com/openldap/openldap/blob/master/libraries/liblmdb/mdb.c#L47...
The problem was not happening often, and always during powering off, so no data was lost etc. However, due to its annoyance, I prepared a simple patch, which prevents mdb_env_reader_dest function being called if the object was already deleted. The patch works fine, I haven't seen the problem since it was introduced.
I am using OpenLDAP 2.4.41, but the code seems to be the same in other versions as well. Would you like me to create a pull request for it?
We've discussed this extensively in private emails. There is no documented reason why the provided patch would have the claimed effect.
Deferring this until a valid explanation for why this patch is correct can be shown.
========================== === The backtrace: ==========================
Program terminated with signal 11, Segmentation fault. #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at ../../../libraries/liblmdb/mdb.c:4050 4050 reader->mr_pid = 0;
#thread apply all bt
Thread 2 (LWP 20920): #0 0x00007f4a9ac31d9d in close () at ../sysdeps/unix/syscall-template.S:84 #1 0x000055902eaeae8c in mdb_env_close1 (env=env@entry=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4756 #2 0x000055902eaeb578 in mdb_env_close1 (env=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4779 #3 mdb_env_close (env=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4778 #4 0x000055902ea69114 in mdb_db_close (be=0x55902faeb300, cr=<optimized out>) at init.c:340 #5 0x000055902ea3815b in over_db_close (be=0x55902faeb300, cr=0x0) at backover.c:182 #6 0x000055902e9d8c0b in backend_shutdown (be=0x55902faeb300, be@entry=0x0) at backend.c:376 #7 0x000055902e9fa258 in slap_shutdown (be=be@entry=0x0) at init.c:232 #8 0x000055902e9af12f in main (argc=<optimized out>, argv=<optimized out>) at main.c:1027
Thread 1 (LWP 20928): #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at ../../../libraries/liblmdb/mdb.c:4050 #1 0x00007f4a9ac2a512 in __nptl_deallocate_tsd () at pthread_create.c:176 #2 0x00007f4a9ac2a767 in start_thread (arg=0x7f4a97acc700) at pthread_create.c:347 #3 0x00007f4a9a968aad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
# frame 0 #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at ../../../libraries/liblmdb/mdb.c:4050 4050 reader->mr_pid = 0; #p reader $1 = (MDB_reader *) 0x7f4a9c71c080 #p *reader Cannot access memory at address 0x7f4a9c71c080
========================== === The proposed patch: ==========================
diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c index 6bdf3151d..56212151b 100644 --- a/libraries/liblmdb/mdb.c +++ b/libraries/liblmdb/mdb.c @@ -4692,6 +4692,11 @@ mdb_env_close0(MDB_env *env, int excl)
if (env->me_flags & MDB_ENV_TXKEY) { pthread_key_delete(env->me_txkey);
// No need to call desctructor anymore, as all pid
// values are cleared below.
env->me_txkey = NULL;
#ifdef _WIN32 /* Delete our key from the global list */ for (i=0; i<mdb_tls_nkeys; i++)