Hi again!
I managed to do a full DIT sync with my configuration, but I noticed two things that seem odd to me:
And as a matter of fact, after having synced the second node successfully, slapd crashed after some delay when there was no requests (except syncrepl) being processed.
Logs like this:
Node 5 (actively managed):
Apr 24 14:11:07 v05 slapd[15418]: do_syncrep1: rid=106 starting refresh (sending cookie=rid=106,sid=005,csn=20130719093756.074776Z#000000#000#000000;20250217105250.345944Z#000000#001#000000;20250218171739.629994Z#000000#002#000000;20250217065706.238392Z#000000#003#000000;20250227092327.859231Z#000000#005#000000;20250424114618.458377Z#000000#006#000000)
Apr 24 14:11:08 v05 slapd[15418]: do_syncrepl: rid=106 rc -1 retrying (3 retries left)
Node 6 (being synced):
Apr 24 14:08:57 v06 slapd[9902]: accesslog_response: adding a new csn=20250424114618.458377Z#000000#006#000000 into minCSN
Apr 24 14:08:57 v06 slapd[9902]: slap_graduate_commit_csn: removing 0x7fd22079a930 20250424114618.458377Z#000000#006#000000
Apr 24 14:08:57 v06 slapd[9902]: do_syncrep2: rid=105 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Apr 24 14:08:57 v06 slapd[9902]: do_syncrep2: rid=105 LDAP_RES_INTERMEDIATE - REFRESH_PRESENT
Apr 24 14:11:07 v06 slapd[9902]: conn=1004 op=2 syncprov_op_search: got a persistent search with a cookie=rid=106,sid=005,csn=20130719093756.074776Z#000000#000#000000;20250217105250.345944Z#000000#001#000000;20250218171739.629994Z#000000#002#000000;20250217065706.238392Z#000000#003#000000;20250227092327.859231Z#000000#005#000000;20250424114618.458377Z#000000#006#000000
Apr 24 14:11:07 v06 slapd[9902]: conn=1004 op=2 syncprov_findbase: searching
Apr 24 14:11:07 v06 slapd[9902]: conn=1004 op=2 syncprov_op_search: registered persistent search
Apr 24 14:11:07 v06 systemd[1]: Started Process Core Dump (PID 9920/UID 0).
Apr 24 14:11:08 v06 systemd-coredump[9921]: [🡕] Process 9902 (slapd) of user 76 dumped core.
Stack trace of thread 9908:
#0 0x00007fd2476a8dfc __pthread_kill_implementation (libc.so.6 + 0xa8dfc)
#1 0x00007fd247657842 raise (libc.so.6 + 0x57842)
#2 0x00007fd24763f5cf abort (libc.so.6 + 0x3f5cf)
#3 0x00007fd24763f4e7 __assert_fail_base.cold (libc.so.6 + 0x3f4e7)
#4 0x00007fd24764fb32 __assert_fail (libc.so.6 + 0x4fb32)
#5 0x00007fd246a25218 n/a (syncprov.so + 0xc218)
#6 0x000055e3e0f504f3 overlay_op_walk (slapd + 0xb74f3)
#7 0x000055e3e0f506be n/a (slapd + 0xb76be)
#8 0x000055e3e0edee54 fe_op_search (slapd + 0x45e54)
#9 0x000055e3e0ede726 do_search (slapd + 0x45726)
#10 0x000055e3e0edc18f n/a (slapd + 0x4318f)
#11 0x000055e3e0edc98d n/a (slapd + 0x4398d)
#12 0x00007fd24791cda0 n/a (libldap-2.5.releng.so.0 + 0x48da0)
#13 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#14 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9902:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a8b83 __pthread_clockjoin_ex (libc.so.6 + 0xa8b83)
#2 0x000055e3e0ed91ca slapd_daemon (slapd + 0x401ca)
#3 0x000055e3e0ebc88e main (slapd + 0x2388e)
#4 0x00007fd247640e6c __libc_start_call_main (libc.so.6 + 0x40e6c)
#5 0x00007fd247640f35 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x40f35)
#6 0x000055e3e0ebc90a _start (slapd + 0x2390a)
Stack trace of thread 9906:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd24791ce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9907:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd24791ce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9909:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd24791ce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9910:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd24791ce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9911:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd24791ce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9912:
#0 0x00007fd2476a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd2476a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd24791ce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 9904:
#0 0x00007fd24772e796 epoll_wait (libc.so.6 + 0x12e796)
#1 0x000055e3e0ed607b n/a (slapd + 0x3d07b)
#2 0x00007fd2476a6f6c start_thread (libc.so.6 + 0xa6f6c)
#3 0x00007fd24772e338 __clone3 (libc.so.6 + 0x12e338)
ELF object binary architecture: AMD x86-64
Apr 24 14:11:08 v06 systemd[1]: systemd-coredump@1-9920-0.service: Deactivated successfully.
Apr 24 14:11:08 v06 systemd[1]: slapd.service: Main process exited, code=dumped, status=6/ABRT
Apr 24 14:11:08 v06 systemd[1]: slapd.service: Failed with result 'core-dump'.
Apr 24 14:11:08 v06 systemd[1]: slapd.service: Consumed 1min 43.162s CPU time.
Rid=106 is cn=config from node 6, BTW.
Kind regards,
Ulrich Windl
From: Windl, Ulrich <u.windl@ukr.de>
Sent: Thursday, April 24, 2025 11:44 AM
To: openldap-technical@openldap.org
Subject: [EXT] RE: core dump when using olcSpSessionlogSource
Update: When I removed the accesslog databases, the problem went away.
Still someone should have a look, I think.
Kind regards,
Ulrich Windl
From: Windl, Ulrich <u.windl@ukr.de>
Sent: Thursday, April 24, 2025 11:39 AM
To: openldap-technical@openldap.org
Subject: [EXT] core dump when using olcSpSessionlogSource
Hi!
Configuring delta-syncrepl, I tried olcSpSessionlogSource instead of olcSPSessionlog.
After slapd war started, it dumped core like this:
…
Apr 24 11:25:41 v05 slapd[5030]: syncprov_db_open: starting syncprov for suffix cn=config
Apr 24 11:25:41 v05 slapd[5030]: conn=-1 op=0 syncprov_findcsn: mode=FIND_MAXCSN csn=
Apr 24 11:25:41 v05 slapd[5030]: syncprov_db_open: starting syncprov for suffix dc=sap,dc=klinik,dc=uni-regensburg,dc=de
Apr 24 11:25:41 v05 slapd[5030]: slapd starting
Apr 24 11:25:41 v05 start[5017]: Starting ldap-server
Apr 24 11:25:41 v05 systemd[1]: Started OpenLDAP Server Daemon.
Apr 24 11:25:41 v05 slapd[5030]: slap_get_csn: conn=1000 op=1 generated new csn=20250424092541.888430Z#000000#005#000000 manage=1
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a01217d0 20250424092541.888430Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a01217d0 20250424092541.888430Z#000000#005#000000
Apr 24 11:25:41 v05 ldapmodify[5035]: DIGEST-MD5 common mech free
Apr 24 11:25:41 v05 slapd[5030]: slap_get_csn: conn=1001 op=1 generated new csn=20250424092541.896113Z#000000#005#000000 manage=1
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a0122df0 20250424092541.896113Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: conn=1001 op=1 syncprov_checkpoint: running checkpoint
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a0122df0 20250424092541.896113Z#000000#005#000000
Apr 24 11:25:41 v05 ldapmodify[5037]: DIGEST-MD5 common mech free
Apr 24 11:25:41 v05 slapd[5030]: slap_get_csn: conn=1002 op=1 generated new csn=20250424092541.900818Z#000000#005#000000 manage=1
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a0128240 20250424092541.900818Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a012b820 20250424092541.900818Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a012b820 20250424092541.900818Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a0128240 20250424092541.900818Z#000000#005#000000
Apr 24 11:25:41 v05 ldapmodify[5039]: DIGEST-MD5 common mech free
Apr 24 11:25:41 v05 slapd[5030]: slap_get_csn: conn=1003 op=1 generated new csn=20250424092541.920778Z#000000#005#000000 manage=1
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd19c106d00 20250424092541.920778Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd19c107760 20250424092541.920778Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd19c107760 20250424092541.920778Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd19c106d00 20250424092541.920778Z#000000#005#000000
Apr 24 11:25:41 v05 ldapmodify[5041]: DIGEST-MD5 common mech free
Apr 24 11:25:41 v05 slapd[5030]: slap_get_csn: conn=1004 op=1 generated new csn=20250424092541.937140Z#000000#005#000000 manage=1
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd19c10c9f0 20250424092541.937140Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd19c11c110 20250424092541.937140Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd19c11c110 20250424092541.937140Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd19c10c9f0 20250424092541.937140Z#000000#005#000000
Apr 24 11:25:41 v05 ldapmodify[5043]: DIGEST-MD5 common mech free
Apr 24 11:25:41 v05 slapd[5030]: slap_get_csn: conn=1005 op=1 generated new csn=20250424092541.989098Z#000000#005#000000 manage=1
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a012b920 20250424092541.989098Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a0107320 20250424092541.989098Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a0107320 20250424092541.989098Z#000000#005#000000
Apr 24 11:25:41 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a012b920 20250424092541.989098Z#000000#005#000000
Apr 24 11:25:41 v05 ldapmodify[5044]: DIGEST-MD5 common mech free
Apr 24 11:25:42 v05 slapd[5030]: slap_get_csn: conn=1006 op=1 generated new csn=20250424092542.004124Z#000000#005#000000 manage=1
Apr 24 11:25:42 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a01277f0 20250424092542.004124Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: index objectClass 0x0004
Apr 24 11:25:42 v05 slapd[5030]: index entryCSN 0x0004
Apr 24 11:25:42 v05 slapd[5030]: index reqEnd 0x0004
Apr 24 11:25:42 v05 slapd[5030]: index reqResult 0x0004
Apr 24 11:25:42 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a02376f0 20250424092542.004124Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a02376f0 20250424092542.004124Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a01277f0 20250424092542.004124Z#000000#005#000000
Apr 24 11:25:42 v05 ldapmodify[5046]: DIGEST-MD5 common mech free
Apr 24 11:25:42 v05 slapd[5030]: slap_get_csn: conn=1007 op=1 generated new csn=20250424092542.020297Z#000000#005#000000 manage=1
Apr 24 11:25:42 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a010a230 20250424092542.020297Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: index objectClass 0x0004
Apr 24 11:25:42 v05 slapd[5030]: index entryCSN 0x0004
Apr 24 11:25:42 v05 slapd[5030]: index reqEnd 0x0004
Apr 24 11:25:42 v05 slapd[5030]: index reqResult 0x0004
Apr 24 11:25:42 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a0117770 20250424092542.020297Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a0117770 20250424092542.020297Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: slap_graduate_commit_csn: removing 0x7fd1a010a230 20250424092542.020297Z#000000#005#000000
Apr 24 11:25:42 v05 ldapmodify[5048]: DIGEST-MD5 common mech free
Apr 24 11:25:42 v05 slapd[5030]: slap_get_csn: conn=1008 op=1 generated new csn=20250424092542.036954Z#000000#005#000000 manage=1
Apr 24 11:25:42 v05 slapd[5030]: slap_queue_csn: queueing 0x7fd1a0237570 20250424092542.036954Z#000000#005#000000
Apr 24 11:25:42 v05 slapd[5030]: syncprov_db_open: starting syncprov for suffix cn=changelog-0
Apr 24 11:25:42 v05 slapd[5030]: [71B blob data]
Apr 24 11:25:42 v05 slapd[5030]: olcSpSessionlogSource: value #0: <olcSpSessionlogSource> failed startup (cn=changelog-0)!
Apr 24 11:25:42 v05 systemd[1]: Started Process Core Dump (PID 5051/UID 0).
Apr 24 11:25:42 v05 systemd-coredump[5052]: [🡕] Process 5030 (slapd) of user 1000 dumped core.
Stack trace of thread 5034:
#0 0x00007fd1b82a8dfc __pthread_kill_implementation (libc.so.6 + 0xa8dfc)
#1 0x00007fd1b8257842 raise (libc.so.6 + 0x57842)
#2 0x00007fd1b823f5cf abort (libc.so.6 + 0x3f5cf)
#3 0x00007fd1b8240497 __libc_message.cold (libc.so.6 + 0x40497)
#4 0x00007fd1b82b34b7 malloc_printerr (libc.so.6 + 0xb34b7)
#5 0x00007fd1b82b5a69 _int_free (libc.so.6 + 0xb5a69)
#6 0x00007fd1b82b81df free (libc.so.6 + 0xb81df)
#7 0x00007fd1b8159bee n/a (syncprov.so + 0x3bee)
#8 0x000056122f1a02ee overlay_destroy_one (slapd + 0xb82ee)
#9 0x000056122f1156bd n/a (slapd + 0x2d6bd)
#10 0x000056122f115fb0 n/a (slapd + 0x2dfb0)
#11 0x000056122f19f533 overlay_op_walk (slapd + 0xb7533)
#12 0x000056122f19f6be n/a (slapd + 0xb76be)
#13 0x000056122f1319ad fe_op_add (slapd + 0x499ad)
#14 0x000056122f1328d9 do_add (slapd + 0x4a8d9)
#15 0x000056122f12b18f n/a (slapd + 0x4318f)
#16 0x000056122f12b98d n/a (slapd + 0x4398d)
#17 0x00007fd1b86bcda0 n/a (libldap-2.5.releng.so.0 + 0x48da0)
#18 0x00007fd1b82a6f6c start_thread (libc.so.6 + 0xa6f6c)
#19 0x00007fd1b832e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 5030:
#0 0x00007fd1b82a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd1b82a8b83 __pthread_clockjoin_ex (libc.so.6 + 0xa8b83)
#2 0x000056122f1281ca slapd_daemon (slapd + 0x401ca)
#3 0x000056122f10b88e main (slapd + 0x2388e)
#4 0x00007fd1b8240e6c __libc_start_call_main (libc.so.6 + 0x40e6c)
#5 0x00007fd1b8240f35 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x40f35)
#6 0x000056122f10b90a _start (slapd + 0x2390a)
Stack trace of thread 5038:
#0 0x00007fd1b82a3c1e __futex_abstimed_wait_common (libc.so.6 + 0xa3c1e)
#1 0x00007fd1b82a6548 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0xa6548)
#2 0x00007fd1b86bce40 n/a (libldap-2.5.releng.so.0 + 0x48e40)
#3 0x00007fd1b82a6f6c start_thread (libc.so.6 + 0xa6f6c)
#4 0x00007fd1b832e338 __clone3 (libc.so.6 + 0x12e338)
Stack trace of thread 5032:
#0 0x00007fd1b832e796 epoll_wait (libc.so.6 + 0x12e796)
#1 0x000056122f12507b n/a (slapd + 0x3d07b)
#2 0x00007fd1b82a6f6c start_thread (libc.so.6 + 0xa6f6c)
#3 0x00007fd1b832e338 __clone3 (libc.so.6 + 0x12e338)
ELF object binary architecture: AMD x86-64
Apr 24 11:25:42 v05 ldapmodify[5050]: DIGEST-MD5 common mech free
Apr 24 11:25:42 v05 systemd[1]: slapd.service: Main process exited, code=dumped, status=6/ABRT
Apr 24 11:25:42 v05 systemd[1]: slapd.service: Failed with result 'core-dump'.
Apr 24 11:25:42 v05 systemd[1]: systemd-coredump@1-5051-0.service: Deactivated successfully.
Looks a bit like a bad free() to me.
I was adding the syncprov overlay with the accesslog DB being defined, but the accesslog overlay to set “olcAccessLogDB: cn=changelog-0” wasn’t done yet. The changelog-0 database may have contained outdated entries.
Kind regards,
Ulrich Windl