https://bugs.openldap.org/show_bug.cgi?id=9858
Issue ID: 9858 Summary: cn=config replication refresh breaks mdb database Product: OpenLDAP Version: 2.6.2 Hardware: All OS: All Status: UNCONFIRMED Keywords: needs_review Severity: normal Priority: --- Component: slapd Assignee: bugs@openldap.org Reporter: quanah@openldap.org Target Milestone: ---
working in an environment with the following layout:
Config node -> This node contains the cn=config databases for consumers in one tree and providers in another tree.
provider nodes -> Pull their configuration from the config node via standard syncrepl
consumer nodes -> Pull their configuration from the config node via standard syncrepl
This worked fine with OpenLDAP 2.4.
With 2.6, using the same cn=config configuration, slapd will segfault if the cn=config db is put through a REFRESH sequence. I've ruled out syncprov as an issue as I removed the syncprov configuration from the consumer nodes entirely and it still occurs (I was thinking perhaps it was the syncprov checkpoint at shutdown). At this point I can reproduce this trivially 100% of the time simply by having a node REFRESH cn=config.
Once the REFRESH has occurred, slapd will continue running. However the next time it is stopped/started it will SEGV on startup. It is usually possible to use slapcat to extract the MDB database and rebuild it, at which point slapd will start again. A couple of times I was only able to extract a portion of the database using slapcat (70MB of 650MB).
https://bugs.openldap.org/show_bug.cgi?id=9858
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|--- |2.6.3
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #1 from Quanah Gibson-Mount quanah@openldap.org --- These threads active at startup:
Thread 7 (Thread 0x7ffae9ffb700 (LWP 11624)): #0 0x00007ffff6ece54d in __lll_lock_wait () from /usr/lib64/libpthread.so.0 #1 0x00007ffff6ec9e9b in _L_lock_883 () from /usr/lib64/libpthread.so.0 #2 0x00007ffff6ec9d68 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0 #3 0x00007ffff7bc2a15 in ldap_pvt_thread_mutex_lock (mutex=<optimized out>) at thr_posix.c:313 #4 0x000055555562343b in monitor_cache_add (mi=mi@entry=0x5555558b07e0 <monitor_info>, e=e@entry=0x555555963328, parent=0x555555962388) at cache.c:102 #5 0x000055555561af7c in monitor_back_register_entry (e=0x555555963238, cb=0x7ffadc001230, mss=0x0, flags=0) at init.c:551 #6 0x00005555556015eb in syncrepl_monitor_add (si=0x555555a89300) at syncrepl.c:7191 #7 do_syncrepl (ctx=0x7ffae9ffaa80, arg=0x555555a8a770) at syncrepl.c:2051 #8 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a8c0) at tpool.c:1053 #9 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #10 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 6 (Thread 0x7ffaea7fc700 (LWP 11623)): #0 mdb_online_index (ctx=<optimized out>, arg=0x555555e91fd0) at config.c:225 #1 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a8c0) at tpool.c:1053 #2 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #3 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 5 (Thread 0x7ffaeaffd700 (LWP 11622)): #0 0x00007ffff6ece5ea in __lll_unlock_wake () from /usr/lib64/libpthread.so.0 #1 0x00007ffff6ecaf9e in _L_unlock_738 () from /usr/lib64/libpthread.so.0 #2 0x00007ffff6ecaf10 in pthread_mutex_unlock () from /usr/lib64/libpthread.so.0 #3 0x00007ffff7bc2a35 in ldap_pvt_thread_mutex_unlock (mutex=<optimized out>) at thr_posix.c:325 #4 0x000055555559d490 in attr_alloc (ad=ad@entry=0x555555936fa0) at attr.c:88 #5 0x000055555559e118 in attr_merge_one (e=e@entry=0x555555963288, desc=desc@entry=0x555555936fa0, val=val@entry=0x7ffaeaffbe00, nval=nval@entry=0x7ffaeaffbd70) at attr.c:578 #6 0x000055555559e1da in attr_merge_normalize_one (e=e@entry=0x555555963288, desc=0x555555936fa0, val=val@entry=0x7ffaeaffbe00, memctx=memctx@entry=0x0) at attr.c:633 #7 0x0000555555627145 in monitor_back_entry_stub (pdn=0x7ffaeaffc060, pndn=0x7ffaeaffc020, rdn=<optimized out>, oc=0x55555595fb60, create=0x0, modify=0x0) at entry.c:212 #8 0x000055555560149b in syncrepl_monitor_add (si=0x555555a87c00) at syncrepl.c:7154 #9 do_syncrepl (ctx=0x7ffaeaffca80, arg=0x555555a89040) at syncrepl.c:2051 #10 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a8c0) at tpool.c:1053 #11 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #12 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 4 (Thread 0x7ffaeb7fe700 (LWP 11621)): #0 0x00007ffff6b749c5 in _int_malloc () from /usr/lib64/libc.so.6 #1 0x00007ffff6b7778c in malloc () from /usr/lib64/libc.so.6 #2 0x00007ffff7973155 in ber_memalloc_x (s=<optimized out>, ctx=ctx@entry=0x0) at memory.c:228 #3 0x00007ffff7973492 in ber_dupbv_x (dst=0x7ffae0001810, src=0x7ffae0000920, ctx=ctx@entry=0x0) at memory.c:505 #4 0x00007ffff7973587 in ber_dupbv (dst=<optimized out>, src=<optimized out>) at memory.c:528 #5 0x000055555559d2da in attr_dup2 (tmp=tmp@entry=0x555555d836c0, a=a@entry=0x555555d83670) at attr.c:239 #6 0x000055555559d792 in attrs_dup (a=0x555555d83670) at attr.c:282 #7 0x00005555555a0aab in entry_dup2 (dest=0x5555559631e8, source=source@entry=0x5555559632d8) at entry.c:940 #8 0x00005555555a0b35 in entry_dup (e=e@entry=0x5555559632d8) at entry.c:949 #9 0x000055555561af3a in monitor_back_register_entry (e=0x5555559632d8, cb=0x7ffae0001230, mss=0x0, flags=0) at init.c:530 #10 0x00005555556015eb in syncrepl_monitor_add (si=0x555555a86640) at syncrepl.c:7191 #11 do_syncrepl (ctx=0x7ffaeb7fda80, arg=0x555555a85150) at syncrepl.c:2051 #12 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a8c0) at tpool.c:1053 #13 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #14 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 3 (Thread 0x7ffaebfff700 (LWP 11620)): #0 ldap_initialize (ldp=ldp@entry=0x7ffaebffde10, url=0x5555559e3c60 "ldaps://local.nonprod.eu1.ldap-cp.klarna.net:40000") at open.c:296 #1 0x000055555558a967 in slap_client_connect (ldp=ldp@entry=0x555555a106a8, sb=sb@entry=0x555555a10440) at config.c:1879 #2 0x0000555555601f86 in do_syncrep1 (si=0x555555a10410, op=0x7ffaebffe510) at syncrepl.c:1019 #3 do_syncrepl (ctx=<optimized out>, arg=0x555555a11af0) at syncrepl.c:2155 #4 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a8c0) at tpool.c:1053 #5 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #6 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 2 (Thread 0x7ffaf0faa700 (LWP 11619)): #0 0x00007ffff6bf10e3 in epoll_wait () from /usr/lib64/libc.so.6 #1 0x000055555558edb4 in slapd_daemon_task (ptr=<optimized out>) at daemon.c:2845 #2 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #3 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 1 (Thread 0x7ffff7fe0880 (LWP 11613)): #0 0x00007ffff6ec9017 in pthread_join () from /usr/lib64/libpthread.so.0 #1 0x00007ffff7bc2975 in ldap_pvt_thread_join (thread=<optimized out>, thread_return=<optimized out>) at thr_posix.c:214 #2 0x0000555555591afa in slapd_daemon () at daemon.c:3378 #3 0x0000555555575cba in main (argc=<optimized out>, argv=<optimized out>) at main.c:873
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #2 from Quanah Gibson-Mount quanah@openldap.org --- backtrace of thread 6, where the SEGV occurred:
#0 mdb_online_index (ctx=<optimized out>, arg=0x555555e91fd0) at config.c:225 rtask = 0x555555e91fd0 be = 0x7fffffffdf50 mdb = 0x1 conn = {c_conn_state = SLAP_C_INVALID, c_conn_idx = -1, c_sd = 0, c_close_reason = 0x0, c_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, c_sb = 0x0, c_starttime = 0, c_activitytime = 0, c_connid = 18446744073709551615, c_peer_domain = {bv_len = 0, bv_val = 0x55555563be5b ""}, c_peer_name = {bv_len = 0, bv_val = 0x55555563be5b ""}, c_listener = 0x55555589f640 <dummy_list>, c_sasl_bind_mech = {bv_len = 0, bv_val = 0x0}, c_sasl_dn = {bv_len = 0, bv_val = 0x0}, c_sasl_authz_dn = {bv_len = 0, bv_val = 0x0}, c_authz_backend = 0x0, c_authz_cookie = 0x0, c_authz = { sai_method = 0, sai_mech = {bv_len = 0, bv_val = 0x0}, sai_dn = {bv_len = 0, bv_val = 0x0}, sai_ndn = {bv_len = 0, bv_val = 0x0}, sai_ssf = 0, sai_transport_ssf = 0, sai_tls_ssf = 0, sai_sasl_ssf = 0}, c_protocol = 0, c_ops = {stqh_first = 0x0, stqh_last = 0x0}, c_pending_ops = { stqh_first = 0x0, stqh_last = 0x0}, c_write1_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, c_write1_cv = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}, c_currentber = 0x0, c_writers = 0, c_writing = 0 '\000', c_sasl_bind_in_progress = 0 '\000', c_writewaiter = 0 '\000', c_is_tls = 0 '\000', c_needs_tls_accept = 0 '\000', c_sasl_layers = 0 '\000', c_sasl_done = 0 '\000', c_sasl_authctx = 0x0, c_sasl_sockctx = 0x0, c_sasl_extra = 0x0, c_sasl_cbind = 0x0, c_sasl_bindop = 0x0, c_txn = 0, c_txn_backend = 0x0, c_txn_ops = {stqh_first = 0x0, stqh_last = 0x0}, c_pagedresults_state = {ps_be = 0x0, ps_size = 0, ps_count = 0, ps_cookie = 0, ps_cookieval = {bv_len = 0, bv_val = 0x0}}, c_n_ops_received = 0, c_n_ops_executing = 0, c_n_ops_pending = 0, c_n_ops_completed = 0, c_n_ops_async = 0, c_n_get = 0, c_n_read = 0, c_n_write = 0, c_extensions = 0x0, c_clientfunc = 0x0, c_clientarg = 0x0, c_send_ldap_result = 0x5555555a58b0 <slap_send_ldap_result>, c_send_search_entry = 0x5555555a71e0 <slap_send_search_entry>, c_send_search_reference = 0x5555555a65d0 <slap_send_search_reference>, c_send_ldap_extended = 0x5555555a61c0 <slap_send_ldap_extended>, c_send_ldap_intermediate = 0x5555555a6440 <slap_send_ldap_intermediate>} opbuf = {ob_op = {o_hdr = 0x7ffaea7fb7b8, o_tag = 0, o_time = 1654123158, o_tincr = 408405, o_tusec = 495749, o_qtime = {tv_sec = 0, tv_usec = 0}, o_bd = 0x7fffffffdf50, o_req_dn = {bv_len = 0, bv_val = 0x0}, o_req_ndn = {bv_len = 0, bv_val = 0x0}, o_request = {oq_add = {rs_modlist = 0x0, rs_e = 0x0}, oq_bind = {rb_method = 0, rb_cred = {bv_len = 0, bv_val = 0x0}, rb_edn = {bv_len = 0, bv_val = 0x0}, rb_ssf = 0, rb_mech = { bv_len = 0, bv_val = 0x0}}, oq_compare = {rs_ava = 0x0}, oq_modify = {rs_mods = {rs_modlist = 0x0, rs_no_opattrs = 0 '\000'}, rs_increment = 0}, oq_modrdn = {rs_mods = {rs_modlist = 0x0, rs_no_opattrs = 0 '\000'}, rs_deleteoldrdn = 0, rs_newrdn = {bv_len = 0, bv_val = 0x0}, rs_nnewrdn = { bv_len = 0, bv_val = 0x0}, rs_newSup = 0x0, rs_nnewSup = 0x0, rs_newDN = {bv_len = 0, bv_val = 0x0}, rs_nnewDN = {bv_len = 0, bv_val = 0x0}}, oq_search = {rs_scope = 0, rs_deref = 0, rs_slimit = 0, rs_tlimit = 0, rs_limit = 0x0, rs_attrsonly = 0, rs_attrs = 0x0, rs_filter = 0x0, rs_filterstr = {bv_len = 0, bv_val = 0x0}}, oq_abandon = {rs_msgid = 0}, oq_cancel = {rs_msgid = 0}, oq_extended = {rs_reqoid = {bv_len = 0, bv_val = 0x0}, rs_flags = 0, rs_reqdata = 0x0}, oq_pwdexop = {rs_extended = {rs_reqoid = {bv_len = 0, bv_val = 0x0}, rs_flags = 0, rs_reqdata = 0x0}, rs_old = {bv_len = 0, bv_val = 0x0}, rs_new = {bv_len = 0, bv_val = 0x0}, rs_mods = 0x0, rs_modtail = 0x0}}, o_abandon = 0, o_cancel = 0, o_groups = 0x0, o_do_not_cache = 0 '\000', o_is_auth_check = 0 '\000', o_dont_replicate = 0 '\000', o_acl_priv = ACL_NONE, o_nocaching = 0 '\000', o_delete_glue_parent = 0 '\000', o_no_schema_check = 0 '\000', o_no_subordinate_glue = 0 '\000', o_ctrlflag = '\000' <repeats 31 times>, o_controls = 0x7ffaea7fb900, o_authz = {sai_method = 0, sai_mech = {bv_len = 0, bv_val = 0x0}, sai_dn = { bv_len = 0, bv_val = 0x0}, sai_ndn = {bv_len = 0, bv_val = 0x0}, sai_ssf = 0, sai_transport_ssf = 0, sai_tls_ssf = 0, sai_sasl_ssf = 0}, o_ber = 0x0, o_res_ber = 0x0, o_callback = 0x0, o_ctrls = 0x0, o_csn = {bv_len = 0, bv_val = 0x0}, o_private = 0x0, o_extra = {slh_first = 0x0}, o_next = {stqe_next = 0x0}}, ob_hdr = {oh_opid = 0, oh_connid = 18446744073709551615, oh_conn = 0x7ffaea7fb380, oh_msgid = 0, oh_protocol = 0, oh_tid = 140715652794112, oh_threadctx = 0x7ffaea7fba80, oh_tmpmemctx = 0x7ffae40008c0, oh_tmpmfuncs = 0x5555558a9d80 <slap_sl_mfuncs>, oh_counters = 0x5555558b4d80 <slap_counters>, oh_log_prefix = "conn=-1 op=0", '\000' <repeats 243 times>}, ob_controls = {0x0 <repeats 32 times>}} op = 0x7ffaea7fb620 curs = 0x0 key = {mv_size = 8, mv_data = 0x0} data = {mv_size = 0, mv_data = 0x0} txn = 0x0 id = 0 e = 0x0 rc = <optimized out> getnext = 1 i = <optimized out> first = 1 intr = 0 #1 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a8c0) at tpool.c:1053 pq = 0x55555593a8c0 pool = 0x55555593a7c0 task = 0x7ffaec0008c0 work_list = <optimized out> ctx = {ltu_pq = 0x55555593a8c0, ltu_id = 140715652794112, ltu_key = {{ltk_key = 0x5555555efe80 <slap_sl_mem_init>, ltk_data = 0x7ffae40008c0, ltk_free = 0x5555555efd40 <slap_sl_mem_destroy>}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0} <repeats 28 times>, {ltk_key = 0x0, ltk_data = 0x80, ltk_free = 0x7ffff6ec8d61 <_L_unlock_3056+19>}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0}}} kctx = <optimized out> keyslot = <optimized out> hash = <optimized out> pool_lock = 0 freeme = 0 __PRETTY_FUNCTION__ = "ldap_int_thread_pool_wrapper" #2 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0No symbol table info available. #3 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6 No symbol table info available.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #3 from Howard Chu hyc@openldap.org --- Looks like syncrepl caused the particular backend's config to change, forcing it to be closed and reopened. One of the changes affected index definitions, so the online indexer task was added to the runqueue. But because of the close and reopen it was running on a backend that was no longer open.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #4 from Quanah Gibson-Mount quanah@openldap.org --- (In reply to Howard Chu from comment #3)
Looks like syncrepl caused the particular backend's config to change, forcing it to be closed and reopened. One of the changes affected index definitions, so the online indexer task was added to the runqueue. But because of the close and reopen it was running on a backend that was no longer open.
There was no configuration change at all. I can trivially trigger this behavior as I noted initially simply by making the cn=config database on a replicated node do a full REFRESH on cn=config. There have been no changes made to indexing on these systems in at least a year.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #5 from Howard Chu hyc@openldap.org --- That's odd, since the online indexer task only runs because there has been an index config change.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #6 from Howard Chu hyc@openldap.org --- Do an `mdb_dump -p -s ad2i` and an `mdb_dump -s ixck` on your DB and attach the output here
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #7 from Howard Chu hyc@openldap.org --- Also try this patch https://git.openldap.org/openldap/openldap/-/merge_requests/533
That isn't expected to fix everything but should cleanup the backtrace a bit.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #8 from Quanah Gibson-Mount quanah@openldap.org --- After applying the patch, gdb backtrace is:
#0 mdb_online_index (ctx=<optimized out>, arg=0x555555eada10) at config.c:225 rtask = 0x555555eada10 be = 0x7fffffffdf60 mdb = 0x1 conn = {c_conn_state = SLAP_C_INVALID, c_conn_idx = -1, c_sd = 0, c_close_reason = 0x0, c_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, c_sb = 0x0, c_starttime = 0, c_activitytime = 0, c_connid = 18446744073709551615, c_peer_domain = {bv_len = 0, bv_val = 0x55555563be5b ""}, c_peer_name = {bv_len = 0, bv_val = 0x55555563be5b ""}, c_listener = 0x55555589f640 <dummy_list>, c_sasl_bind_mech = {bv_len = 0, bv_val = 0x0}, c_sasl_dn = {bv_len = 0, bv_val = 0x0}, c_sasl_authz_dn = {bv_len = 0, bv_val = 0x0}, c_authz_backend = 0x0, c_authz_cookie = 0x0, c_authz = { sai_method = 0, sai_mech = {bv_len = 0, bv_val = 0x0}, sai_dn = {bv_len = 0, bv_val = 0x0}, sai_ndn = {bv_len = 0, bv_val = 0x0}, sai_ssf = 0, sai_transport_ssf = 0, sai_tls_ssf = 0, sai_sasl_ssf = 0}, c_protocol = 0, c_ops = {stqh_first = 0x0, stqh_last = 0x0}, c_pending_ops = { stqh_first = 0x0, stqh_last = 0x0}, c_write1_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, c_write1_cv = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align = 0}, c_currentber = 0x0, c_writers = 0, c_writing = 0 '\000', c_sasl_bind_in_progress = 0 '\000', c_writewaiter = 0 '\000', c_is_tls = 0 '\000', c_needs_tls_accept = 0 '\000', c_sasl_layers = 0 '\000', c_sasl_done = 0 '\000', c_sasl_authctx = 0x0, c_sasl_sockctx = 0x0, c_sasl_extra = 0x0, c_sasl_cbind = 0x0, c_sasl_bindop = 0x0, c_txn = 0, c_txn_backend = 0x0, c_txn_ops = {stqh_first = 0x0, stqh_last = 0x0}, c_pagedresults_state = {ps_be = 0x0, ps_size = 0, ps_count = 0, ps_cookie = 0, ps_cookieval = {bv_len = 0, bv_val = 0x0}}, c_n_ops_received = 0, c_n_ops_executing = 0, c_n_ops_pending = 0, c_n_ops_completed = 0, c_n_ops_async = 0, c_n_get = 0, c_n_read = 0, c_n_write = 0, c_extensions = 0x0, c_clientfunc = 0x0, c_clientarg = 0x0, c_send_ldap_result = 0x5555555a58b0 <slap_send_ldap_result>, c_send_search_entry = 0x5555555a71e0 <slap_send_search_entry>, c_send_search_reference = 0x5555555a65d0 <slap_send_search_reference>, c_send_ldap_extended = 0x5555555a61c0 <slap_send_ldap_extended>, c_send_ldap_intermediate = 0x5555555a6440 <slap_send_ldap_intermediate>} opbuf = {ob_op = {o_hdr = 0x7ffaeaffc7b8, o_tag = 0, o_time = 1654204173, o_tincr = 134718, o_tusec = 271425, o_qtime = {tv_sec = 0, tv_usec = 0}, o_bd = 0x7fffffffdf60, o_req_dn = {bv_len = 0, bv_val = 0x0}, o_req_ndn = {bv_len = 0, bv_val = 0x0}, o_request = {oq_add = {rs_modlist = 0x0, rs_e = 0x0}, oq_bind = {rb_method = 0, rb_cred = {bv_len = 0, bv_val = 0x0}, rb_edn = {bv_len = 0, bv_val = 0x0}, rb_ssf = 0, rb_mech = { bv_len = 0, bv_val = 0x0}}, oq_compare = {rs_ava = 0x0}, oq_modify = {rs_mods = {rs_modlist = 0x0, rs_no_opattrs = 0 '\000'}, rs_increment = 0}, oq_modrdn = {rs_mods = {rs_modlist = 0x0, rs_no_opattrs = 0 '\000'}, rs_deleteoldrdn = 0, rs_newrdn = {bv_len = 0, bv_val = 0x0}, rs_nnewrdn = {bv_len = 0, bv_val = 0x0}, rs_newSup = 0x0, rs_nnewSup = 0x0, rs_newDN = {bv_len = 0, bv_val = 0x0}, rs_nnewDN = {bv_len = 0, bv_val = 0x0}}, oq_search = {rs_scope = 0, rs_deref = 0, rs_slimit = 0, rs_tlimit = 0, rs_limit = 0x0, rs_attrsonly = 0, rs_attrs = 0x0, rs_filter = 0x0, rs_filterstr = {bv_len = 0, bv_val = 0x0}}, oq_abandon = {rs_msgid = 0}, oq_cancel = { rs_msgid = 0}, oq_extended = {rs_reqoid = {bv_len = 0, bv_val = 0x0}, rs_flags = 0, rs_reqdata = 0x0}, oq_pwdexop = {rs_extended = { rs_reqoid = {bv_len = 0, bv_val = 0x0}, rs_flags = 0, rs_reqdata = 0x0}, rs_old = {bv_len = 0, bv_val = 0x0}, rs_new = {bv_len = 0, bv_val = 0x0}, rs_mods = 0x0, rs_modtail = 0x0}}, o_abandon = 0, o_cancel = 0, o_groups = 0x0, o_do_not_cache = 0 '\000', o_is_auth_check = 0 '\000', o_dont_replicate = 0 '\000', o_acl_priv = ACL_NONE, o_nocaching = 0 '\000', o_delete_glue_parent = 0 '\000', o_no_schema_check = 0 '\000', o_no_subordinate_glue = 0 '\000', o_ctrlflag = '\000' <repeats 31 times>, o_controls = 0x7ffaeaffc900, o_authz = { sai_method = 0, sai_mech = {bv_len = 0, bv_val = 0x0}, sai_dn = {bv_len = 0, bv_val = 0x0}, sai_ndn = {bv_len = 0, bv_val = 0x0}, sai_ssf = 0, sai_transport_ssf = 0, sai_tls_ssf = 0, sai_sasl_ssf = 0}, o_ber = 0x0, o_res_ber = 0x0, o_callback = 0x0, o_ctrls = 0x0, o_csn = {bv_len = 0, bv_val = 0x0}, o_private = 0x0, o_extra = {slh_first = 0x0}, o_next = {stqe_next = 0x0}}, ob_hdr = {oh_opid = 0, oh_connid = 18446744073709551615, oh_conn = 0x7ffaeaffc380, oh_msgid = 0, oh_protocol = 0, oh_tid = 140715661186816, oh_threadctx = 0x7ffaeaffca80, oh_tmpmemctx = 0x7ffae40008c0, oh_tmpmfuncs = 0x5555558a9d80 <slap_sl_mfuncs>, oh_counters = 0x5555558b4d80 <slap_counters>, oh_log_prefix = "conn=-1 op=0", '\000' <repeats 243 times>}, ob_controls = {0x0 <repeats 32 times>}} op = 0x7ffaeaffc620 curs = 0x0 key = {mv_size = 8, mv_data = 0x0} data = {mv_size = 0, mv_data = 0x0} txn = 0x0 id = 0 e = 0x0 rc = <optimized out> getnext = 1 i = <optimized out> first = 1 intr = 0 #1 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a880) at tpool.c:1053 pq = 0x55555593a880 pool = 0x55555593a7a0 task = 0x7ffaec0008c0 work_list = <optimized out> ctx = {ltu_pq = 0x55555593a880, ltu_id = 140715661186816, ltu_key = {{ltk_key = 0x5555555efe80 <slap_sl_mem_init>, ltk_data = 0x7ffae40008c0, ltk_free = 0x5555555efd40 <slap_sl_mem_destroy>}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0} <repeats 28 times>, {ltk_key = 0x0, ltk_data = 0x80, ltk_free = 0x7ffff6ec8d61 <_L_unlock_3056+19>}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0}}} kctx = <optimized out> keyslot = <optimized out> hash = <optimized out> pool_lock = 0 freeme = 0 __PRETTY_FUNCTION__ = "ldap_int_thread_pool_wrapper" #2 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 No symbol table info available. #3 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6 No symbol table info available.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #9 from Quanah Gibson-Mount quanah@openldap.org --- threads are:
Thread 5 (Thread 0x7ffaeaffd700 (LWP 30534)): #0 mdb_online_index (ctx=<optimized out>, arg=0x555555eada10) at config.c:225 #1 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a880) at tpool.c:1053 #2 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #3 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 4 (Thread 0x7ffaeb7fe700 (LWP 30533)): #0 0x00007ffff6ece54d in __lll_lock_wait () from /usr/lib64/libpthread.so.0 #1 0x00007ffff6ec9e9b in _L_lock_883 () from /usr/lib64/libpthread.so.0 #2 0x00007ffff6ec9d68 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0 #3 0x00007ffff7bc2a15 in ldap_pvt_thread_mutex_lock (mutex=<optimized out>) at thr_posix.c:313 #4 0x0000555555623148 in monitor_cache_get (mi=mi@entry=0x5555558b07e0 <monitor_info>, ndn=ndn@entry=0x7ffaeb7fce10, ep=ep@entry=0x7ffaeb7fce08) at cache.c:207 #5 0x000055555561ad29 in monitor_back_register_entry (e=0x5555559633c8, cb=0x7ffadc001230, mss=0x0, flags=0) at init.c:498 #6 0x00005555556015eb in syncrepl_monitor_add (si=0x555555a95630) at syncrepl.c:7191 #7 do_syncrepl (ctx=0x7ffaeb7fda80, arg=0x555555a93d30) at syncrepl.c:2051 #8 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a880) at tpool.c:1053 #9 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #10 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 3 (Thread 0x7ffaebfff700 (LWP 30532)): #0 connection_fake_init2 (conn=conn@entry=0x7ffaebffe270, opbuf=opbuf@entry=0x7ffaebffe510, ctx=ctx@entry=0x7ffaebffea80, newmem=newmem@entry=1) at connection.c:2072 #1 0x000055555559670a in connection_fake_init (conn=conn@entry=0x7ffaebffe270, opbuf=opbuf@entry=0x7ffaebffe510, ctx=ctx@entry=0x7ffaebffea80) at connection.c:2037 #2 0x00005555555fff94 in do_syncrepl (ctx=0x7ffaebffea80, arg=0x555555a2f250) at syncrepl.c:2096 #3 0x00007ffff7bc20f3 in ldap_int_thread_pool_wrapper (xpool=0x55555593a880) at tpool.c:1053 #4 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #5 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 2 (Thread 0x7ffaf0b72700 (LWP 30531)): #0 0x00007ffff6bf10e3 in epoll_wait () from /usr/lib64/libc.so.6 #1 0x000055555558edb4 in slapd_daemon_task (ptr=<optimized out>) at daemon.c:2845 #2 0x00007ffff6ec7ea5 in start_thread () from /usr/lib64/libpthread.so.0 #3 0x00007ffff6bf0b0d in clone () from /usr/lib64/libc.so.6
Thread 1 (Thread 0x7ffff7fe0880 (LWP 30527)): #0 0x00007ffff6ec9017 in pthread_join () from /usr/lib64/libpthread.so.0 #1 0x00007ffff7bc2975 in ldap_pvt_thread_join (thread=<optimized out>, thread_return=<optimized out>) at thr_posix.c:214 #2 0x0000555555591afa in slapd_daemon () at daemon.c:3378 #3 0x0000555555575cba in main (argc=<optimized out>, argv=<optimized out>) at main.c:873
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #10 from Quanah Gibson-Mount quanah@openldap.org --- Created attachment 902 --> https://bugs.openldap.org/attachment.cgi?id=902&action=edit a2di mdb dump
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #11 from Quanah Gibson-Mount quanah@openldap.org --- Created attachment 903 --> https://bugs.openldap.org/attachment.cgi?id=903&action=edit ixck mdb dump
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #12 from Howard Chu hyc@openldap.org --- I didn't ask for '-p' on the ixck dump.
Anyway, the ixck DB should be empty. The fact that it's not means that it had unfinished background indexing tasks, which it tried to run on the next startup. That explains why the online index task is being run.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #13 from Quanah Gibson-Mount quanah@openldap.org --- (In reply to Howard Chu from comment #12)
I didn't ask for '-p' on the ixck dump.
Anyway, the ixck DB should be empty. The fact that it's not means that it had unfinished background indexing tasks, which it tried to run on the next startup. That explains why the online index task is being run.
Ok, so the bug seems to be that an indexing task is being scheduled incorrectly when the config db refreshes.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #14 from Howard Chu hyc@openldap.org --- You're going to need the mdb_drop tool, which isn't in the release. It's in ec3e4ed9d80b42743d5e87e1eba553c500334fc5
mdb_drop -s ixck
to empty the ixck DB. Looks like its contents are invalid, anyway.
https://bugs.openldap.org/show_bug.cgi?id=9858
--- Comment #15 from Howard Chu hyc@openldap.org --- After you empty the ixck DB, see if the next config refresh causes it to be populated again. That seems most likely to be what's happening.
https://bugs.openldap.org/show_bug.cgi?id=9858
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED Keywords|needs_review |
--- Comment #16 from Quanah Gibson-Mount quanah@openldap.org --- head:
• 9e5701cd by Howard Chu at 2022-06-02T16:06:31+01:00 ITS#9858 back-mdb: delay indexer task startup
• 395e9b25 by Howard Chu at 2022-06-03T17:06:39+01:00 ITS#9858 back-mdb: fix index reconfig
RE26:
• ac061c68 by Howard Chu at 2022-06-03T16:55:45+00:00 ITS#9858 back-mdb: delay indexer task startup
• c43c7a93 by Howard Chu at 2022-06-03T16:55:59+00:00 ITS#9858 back-mdb: fix index reconfig
https://bugs.openldap.org/show_bug.cgi?id=9858
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Assignee|bugs@openldap.org |hyc@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9858
Quanah Gibson-Mount quanah@openldap.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |VERIFIED