https://bugs.openldap.org/show_bug.cgi?id=9952
Issue ID: 9952
Summary: Crash on exit with OpenSSL 3
Product: OpenLDAP
Version: 2.6.2
Hardware: All
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: libraries
Assignee: bugs@openldap.org
Reporter: artur.zaprzala@gmail.com
Target Milestone: ---
A program using libldap will crash on exit after using an SSL connection.
How to reproduce on CentOS 9:

Uncomment the following lines in /etc/pki/tls/openssl.cnf:

    [provider_sect]
    legacy = legacy_sect

    [legacy_sect]
    activate = 1
Run the command (you must enter a valid LDAP server address):

    python3 -c "import ldap; ldap.initialize('ldaps://<LDAP SERVER ADDRESS>').whoami_s()"

Another example (no server required):

    python3 -c "import ctypes; ctypes.CDLL('libldap.so.2').ldap_pvt_tls_init_def_ctx(0)"
Results: Segmentation fault (core dumped)
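The crash happens only while the process exits, so it can help to run the reproducer in a child process and inspect how that child ended. The following is a minimal sketch of such a wrapper. The helper names `describe_exit` and `run_snippet` are made up for illustration and are not part of libldap or python-ldap. It assumes a POSIX system, where `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys

# Second reproducer from the report: needs libldap.so.2 but no LDAP server.
REPRO = "import ctypes; ctypes.CDLL('libldap.so.2').ldap_pvt_tls_init_def_ctx(0)"


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable description."""
    if returncode < 0:
        # On POSIX a negative return code means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_snippet(code: str) -> int:
    """Run a Python snippet in a fresh interpreter and return its return code."""
    return subprocess.run([sys.executable, "-c", code]).returncode
```

On an affected system, `describe_exit(run_snippet(REPRO))` would be expected to report SIGSEGV. On a system without the problem, or without libldap installed, it reports an ordinary exit status instead.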
Backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
#0  ___pthread_rwlock_rdlock (rwlock=0x0) at pthread_rwlock_rdlock.c:27
#1  0x00007ffff7c92f3d in CRYPTO_THREAD_read_lock (lock=<optimized out>) at crypto/threads_pthread.c:85
#2  0x00007ffff7c8b126 in ossl_lib_ctx_get_data (ctx=0x7ffff7eff540 <default_context_int.lto_priv>, index=1, meth=0x7ffff7eb8a00 <provider_store_method.lto_priv>) at crypto/context.c:398
#3  0x00007ffff7c98bea in get_provider_store (libctx=<optimized out>) at crypto/provider_core.c:334
#4  ossl_provider_deregister_child_cb (handle=0x5555555ed620) at crypto/provider_core.c:1752
#5  0x00007ffff7c8bf2f in ossl_provider_deinit_child (ctx=0x5555555d2650) at crypto/provider_child.c:279
#6  OSSL_LIB_CTX_free (ctx=0x5555555d2650) at crypto/context.c:283
#7  OSSL_LIB_CTX_free (ctx=0x5555555d2650) at crypto/context.c:276
#8  0x00007ffff7634af6 in legacy_teardown (provctx=0x5555555ee9f0) at providers/legacyprov.c:168
#9  0x00007ffff7c9901b in ossl_provider_teardown (prov=0x5555555ed620) at crypto/provider_core.c:1477
#10 ossl_provider_free (prov=0x5555555ed620) at crypto/provider_core.c:683
#11 0x00007ffff7c63956 in ossl_provider_free (prov=<optimized out>) at crypto/provider_core.c:668
#12 evp_cipher_free_int (cipher=0x555555916c10) at crypto/evp/evp_enc.c:1632
#13 EVP_CIPHER_free (cipher=0x555555916c10) at crypto/evp/evp_enc.c:1647
#14 0x00007ffff7a6bc1d in ssl_evp_cipher_free (cipher=0x555555916c10) at ssl/ssl_lib.c:5925
#15 ssl_evp_cipher_free (cipher=0x555555916c10) at ssl/ssl_lib.c:5915
#16 SSL_CTX_free (a=0x555555ec1020) at ssl/ssl_lib.c:3455
#17 SSL_CTX_free (a=0x555555ec1020) at ssl/ssl_lib.c:3392
#18 0x00007fffe95edb89 in ldap_int_tls_destroy (lo=0x7fffe9616000 <ldap_int_global_options>) at /usr/src/debug/openldap-2.6.2-1.el9_0.x86_64/openldap-2.6.2/libraries/libldap/tls2.c:104
#19 0x00007ffff7fd100b in _dl_fini () at dl-fini.c:138
#20 0x00007ffff7873475 in __run_exit_handlers (status=0, listp=0x7ffff7a11658 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:113
#21 0x00007ffff78735f0 in __GI_exit (status=<optimized out>) at exit.c:143
#22 0x00007ffff785be57 in __libc_start_call_main (main=main@entry=0x55555556aa20 <main>, argc=argc@entry=4, argv=argv@entry=0x7fffffffe2b8) at ../sysdeps/nptl/libc_start_call_main.h:74
#23 0x00007ffff785befc in __libc_start_main_impl (main=0x55555556aa20 <main>, argc=4, argv=0x7fffffffe2b8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe2a8) at ../csu/libc-start.c:409
#24 0x000055555556b575 in _start ()
The problem is that ldap_int_tls_destroy() is called after the clean up of libssl.
On program exit, at first default_context_int is cleaned up (OPENSSL_cleanup() was registered with atexit()):

#0  ossl_lib_ctx_default_deinit () at crypto/context.c:196
#1  OPENSSL_cleanup () at crypto/init.c:424
#2  OPENSSL_cleanup () at crypto/init.c:338
#3  0x00007ffff7873475 in __run_exit_handlers (status=0, listp=0x7ffff7a11658 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:113
#4  0x00007ffff78735f0 in __GI_exit (status=<optimized out>) at exit.c:143
#5  0x00007ffff785be57 in __libc_start_call_main (main=main@entry=0x55555556aa20 <main>, argc=argc@entry=4, argv=argv@entry=0x7fffffffe2c8) at ../sysdeps/nptl/libc_start_call_main.h:74
#6  0x00007ffff785befc in __libc_start_main_impl (main=0x55555556aa20 <main>, argc=4, argv=0x7fffffffe2c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe2b8) at ../csu/libc-start.c:409
#7  0x000055555556b575 in _start ()
Then ossl_lib_ctx_get_data() tries to use default_context_int.lock, which is NULL. ldap_int_tls_destroy() is called by ldap_int_destroy_global_options(), registered by "__attribute__ ((destructor))".
It seems that shared library destructors are always called before functions registered with atexit().

A solution may be to modify libraries/libldap/init.c to use atexit() instead of "__attribute__ ((destructor))". The atexit() manual page says: "Since glibc 2.2.3, atexit() can be used within a shared library to establish functions that are called when the shared library is unloaded.".

Functions registered with atexit() are called in the reverse order of their registration, so libssl must be initialized before libldap. If the order is wrong, libldap should detect it somehow and exit with abort().
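The ordering visible in the two backtraces can be illustrated with a toy model. This is a sketch under stated assumptions, not glibc's implementation. The `ExitHandlers` class is hypothetical. The model assumes that `_dl_fini` (which runs the shared-library destructors) sits at the bottom of the same last-in-first-out handler list as OPENSSL_cleanup(), as the `__run_exit_handlers` frames in both backtraces suggest.

```python
class ExitHandlers:
    """Toy LIFO exit-handler list; a model only, not glibc's real data structure."""

    def __init__(self):
        self._handlers = []

    def register(self, name, func):
        self._handlers.append((name, func))

    def run(self):
        """Run handlers in reverse registration order and return the names in run order."""
        order = []
        while self._handlers:
            name, func = self._handlers.pop()  # last registered runs first
            func()
            order.append(name)
        return order


def simulate_crash_order():
    # Stand-in for default_context_int.lock inside libcrypto.
    state = {"openssl_lock": object()}
    events = []

    def dl_fini():
        # Shared-library destructors run here, including libldap's
        # ldap_int_destroy_global_options() -> ldap_int_tls_destroy().
        if state["openssl_lock"] is None:
            events.append("ldap_int_tls_destroy: lock is NULL (segfault in the real program)")
        else:
            events.append("ldap_int_tls_destroy: ok")

    def openssl_cleanup():
        state["openssl_lock"] = None
        events.append("OPENSSL_cleanup: default context torn down")

    handlers = ExitHandlers()
    handlers.register("_dl_fini", dl_fini)  # assumption: registered at process start
    handlers.register("OPENSSL_cleanup", openssl_cleanup)  # registered later via atexit()
    return handlers.run(), events
```

In this model `simulate_crash_order()` runs OPENSSL_cleanup first and `_dl_fini` second, so the libldap teardown finds the lock already gone, which matches the `rwlock=0x0` frame in the crash backtrace.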
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Keywords        |needs_review                |
Target Milestone|---                         |2.6.5
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|2.6.5                       |2.5.15
Assignee        |bugs@openldap.org           |hyc@openldap.org
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Ever confirmed  |0                           |1
Status          |UNCONFIRMED                 |IN_PROGRESS
--- Comment #1 from Howard Chu hyc@openldap.org ---
(In reply to artur.zaprzala from comment #0)
> A program using libldap will crash on exit after using an SSL connection.
>
> The problem is that ldap_int_tls_destroy() is called after the clean up of libssl.
>
> On program exit, at first default_context_int is cleaned up (OPENSSL_cleanup() was registered with atexit()):
>
> #0  ossl_lib_ctx_default_deinit () at crypto/context.c:196
> #1  OPENSSL_cleanup () at crypto/init.c:424
> #2  OPENSSL_cleanup () at crypto/init.c:338
> #3  0x00007ffff7873475 in __run_exit_handlers (status=0, listp=0x7ffff7a11658 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:113
> #4  0x00007ffff78735f0 in __GI_exit (status=<optimized out>) at exit.c:143
> #5  0x00007ffff785be57 in __libc_start_call_main (main=main@entry=0x55555556aa20 <main>, argc=argc@entry=4, argv=argv@entry=0x7fffffffe2c8) at ../sysdeps/nptl/libc_start_call_main.h:74
> #6  0x00007ffff785befc in __libc_start_main_impl (main=0x55555556aa20 <main>, argc=4, argv=0x7fffffffe2c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe2b8) at ../csu/libc-start.c:409
> #7  0x000055555556b575 in _start ()
>
> Then ossl_lib_ctx_get_data() tries to use default_context_int.lock, which is NULL. ldap_int_tls_destroy() is called by ldap_int_destroy_global_options(), registered by "__attribute__ ((destructor))".
>
> It seems that shared library destructors are always called before functions registered with atexit().

Sounds like your description is backwards: the libldap destructor got called after OpenSSL's atexit() handler.

> A solution may be to modify libraries/libldap/init.c to use atexit() instead of "__attribute__ ((destructor))". The atexit() manual page says: "Since glibc 2.2.3, atexit() can be used within a shared library to establish functions that are called when the shared library is unloaded.".
>
> Functions registered with atexit() are called in the reverse order of their registration, so libssl must be initialized before libldap. If the order is wrong, libldap should detect it somehow and exit with abort().

This sounds OK. Since libldap will call OPENSSL_init_ssl() first, we can guarantee that their atexit() invocation happens first.
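The ordering argument can be checked with Python's own `atexit` module, which is also documented to run handlers in last-in-first-out order. This is only an analogy for the C-level atexit() behaviour discussed here, and the two handler functions are stand-ins that print a name rather than call OpenSSL or libldap.

```python
import subprocess
import sys

# Script run in a child interpreter so that its exit handlers actually fire.
CHILD = """
import atexit

def openssl_cleanup():
    # Stand-in for the handler OpenSSL registers during OPENSSL_init_ssl().
    print("OPENSSL_cleanup")

def ldap_tls_destroy():
    # Stand-in for a libldap TLS teardown registered afterwards.
    print("ldap_int_tls_destroy")

atexit.register(openssl_cleanup)   # registered first
atexit.register(ldap_tls_destroy)  # registered second, so it runs first
"""


def exit_order():
    """Return the handler names in the order the child printed them at exit."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()
```

With this registration order the teardown stand-in runs before the cleanup stand-in, which is the property the proposed change relies on.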
--- Comment #2 from Howard Chu hyc@openldap.org ---
https://git.openldap.org/openldap/openldap/-/merge_requests/624

Please test. I was unable to reproduce the crash on my Ubuntu 22.04 machine.
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|2.5.15                      |2.5.16
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|2.5.16                      |2.5.17
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Status          |IN_PROGRESS                 |RESOLVED
Resolution      |---                         |FIXED

--- Comment #3 from Quanah Gibson-Mount quanah@openldap.org ---
head:
• 337455eb by Howard Chu at 2023-05-31T16:04:15+00:00 ITS#9952 libldap: use atexit for TLS teardown
RE26:
• 375d21a9 by Howard Chu at 2023-09-26T17:22:07+00:00 ITS#9952 libldap: use atexit for TLS teardown
RE25:
• 5f87a709 by Howard Chu at 2023-09-26T17:23:05+00:00 ITS#9952 libldap: use atexit for TLS teardown
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Status          |RESOLVED                    |VERIFIED
--- Comment #4 from philip.miloslavsky@gmail.com ---
https://bugs.openldap.org/show_bug.cgi?id=10176

See this bug for problems caused by the atexit() change.
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
See Also        |                            |https://bugs.openldap.org/show_bug.cgi?id=10176
--- Comment #5 from Howard Chu hyc@openldap.org ---
I was finally able to reproduce this crash in a CentOS 9 container. The real bug is that OpenSSL's thread locking functions don't check whether the passed-in lock is NULL. https://github.com/openssl/openssl/pull/23616 will fix that bug.
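The kind of guard described here can be sketched as follows. This is an illustration only and is not taken from the OpenSSL pull request. `read_lock` is a hypothetical stand-in for a lock wrapper such as CRYPTO_THREAD_read_lock(), with a `threading.Lock` standing in for the read-write lock and `None` standing in for a NULL pointer. The 1/0 return values follow OpenSSL's usual success/failure convention.

```python
import threading


def read_lock(lock):
    """Acquire the lock and return 1, or return 0 if no lock was passed in."""
    if lock is None:
        # Without this check the real code hands NULL to pthread_rwlock_rdlock().
        return 0
    lock.acquire()
    return 1
```

A caller that checks the return value can then treat a missing lock as an ordinary failure during late teardown instead of crashing.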
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
See Also        |                            |https://github.com/openssl/openssl/pull/23616
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Resolution      |FIXED                       |---
Status          |VERIFIED                    |CONFIRMED

--- Comment #6 from Howard Chu hyc@openldap.org ---
Re-fixed in git master a5953812f0c03e802e61109ae18e8fed5f3f2df8
Previous fix reverted in 5e13ef87a94491f9339dbca709db29e76741f1a9
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Resolution      |---                         |TEST
Status          |CONFIRMED                   |RESOLVED
--- Comment #7 from Dalton Durst github@daltondur.st ---
The reverted version of this fix was included in 2.5.17. It may be desirable to backport the revert and the new fix to the 2_5 branch.
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|2.5.17                      |2.5.18

--- Comment #8 from Quanah Gibson-Mount quanah@openldap.org ---
This didn't make it into 2.5.17 as the normal bug process was not followed and I missed it.
--- Comment #9 from Quanah Gibson-Mount quanah@openldap.org ---
It did not go into 2.6.7 either.
--- Comment #10 from Quanah Gibson-Mount quanah@openldap.org ---
Final fixes were after the 2.5.17/2.6.7 releases.
head:
• 5e13ef87 by Howard Chu at 2024-02-13T17:29:05+00:00 Revert "ITS#9952 libldap: use atexit for TLS teardown"
• a5953812 by Howard Chu at 2024-02-18T10:57:07+00:00 ITS#9952 TLS/OpenSSL: disable use of atexit()
--- Comment #11 from Quanah Gibson-Mount quanah@openldap.org ---
RE26:
• 5e598b43 by Howard Chu at 2024-03-26T16:33:50+00:00 Revert "ITS#9952 libldap: use atexit for TLS teardown"
• e08b80e8 by Howard Chu at 2024-03-26T16:33:55+00:00 ITS#9952 TLS/OpenSSL: disable use of atexit()
RE25:
• dcbd0113 by Howard Chu at 2024-03-26T16:32:23+00:00 Revert "ITS#9952 libldap: use atexit for TLS teardown"
• 6dc030a8 by Howard Chu at 2024-03-26T16:32:29+00:00 ITS#9952 TLS/OpenSSL: disable use of atexit()
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Resolution      |TEST                        |FIXED
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
See Also        |                            |https://bugs.openldap.org/show_bug.cgi?id=10209
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Status          |RESOLVED                    |VERIFIED
Matthew Hardin mhardin@symas.com changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Status          |VERIFIED                    |CONFIRMED
Resolution      |FIXED                       |---

--- Comment #12 from Matthew Hardin mhardin@symas.com ---
This bug is still in evidence as of 2.6.8 under Windows/MSYS2/UCRT64: slapd and the slaptools all dump core on exit. Applying the patch below, which is supposed to be obsolete, fixes the problem, so there's still more work to do here.
Subject: [PATCH] ITS#9952 libldap: use atexit for TLS teardown

---
 libraries/libldap/init.c |  3 ---
 libraries/libldap/tls2.c | 14 +++++++++++++-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/libraries/libldap/init.c b/libraries/libldap/init.c
index 3a81790dcf..b9915533bd 100644
--- a/libraries/libldap/init.c
+++ b/libraries/libldap/init.c
@@ -544,9 +544,6 @@ ldap_int_destroy_global_options(void)
 		gopts->ldo_def_sasl_authcid = NULL;
 	}
 #endif
-#ifdef HAVE_TLS
-	ldap_int_tls_destroy( gopts );
-#endif
 }
 
 /*
diff --git a/libraries/libldap/tls2.c b/libraries/libldap/tls2.c
index dff845bc10..4bfc346c70 100644
--- a/libraries/libldap/tls2.c
+++ b/libraries/libldap/tls2.c
@@ -160,6 +160,14 @@ ldap_pvt_tls_destroy( void )
 	tls_imp->ti_tls_destroy();
 }
 
+static void
+ldap_exit_tls_destroy( void )
+{
+	struct ldapoptions *lo = LDAP_INT_GLOBAL_OPT();
+
+	ldap_int_tls_destroy( lo );
+}
+
 /*
  * Initialize a particular TLS implementation.
  * Called once per implementation.
@@ -168,6 +176,7 @@ static int
 tls_init(tls_impl *impl, int do_threads )
 {
 	static int tls_initialized = 0;
+	int rc;
 
 	if ( !tls_initialized++ ) {
 #ifdef LDAP_R_COMPILE
@@ -183,7 +192,10 @@ tls_init(tls_impl *impl, int do_threads )
 #endif
 	}
 
-	return impl->ti_tls_init();
+	rc = impl->ti_tls_init();
+
+	atexit( ldap_exit_tls_destroy );
+	return rc;
 }
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
CC              |                            |simon.pichugin@gmail.com

--- Comment #13 from Quanah Gibson-Mount quanah@openldap.org ---
*** Issue 10255 has been marked as a duplicate of this issue. ***
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
URL             |                            |https://github.com/openssl/openssl/issues/25294

--- Comment #14 from Quanah Gibson-Mount quanah@openldap.org ---
From ITS#10255:
https://github.com/openssl/openssl/issues/25294
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|2.5.18                      |2.5.19
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
See Also        |                            |https://bugs.openldap.org/show_bug.cgi?id=9303
Howard Chu hyc@openldap.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Resolution      |---                         |FIXED
Status          |CONFIRMED                   |RESOLVED

--- Comment #15 from Howard Chu hyc@openldap.org ---
The current code in git is basically a no-op for OpenSSL 3 teardown, and has been since commit 01cbb7f4c6326f053d3daf317d0cc0b26a1b02fe in 2017. We only call any cleanup routines in OpenSSL 1.1 and older. The crashes in OpenSSL 3 were confirmed to have been fixed already. I see no reason to make any more changes for OpenSSL 1.1 since that's EOL.