Hello,
since upgrading from Buster to Bookworm we have been fighting smaller and bigger issues with our OpenLDAP. We use WebADM (RCDevs) as IDM, which uses OpenLDAP as its backend. For a long while now on Bookworm, slapd has been getting stuck on operations such as adding entries, for example when adding more than one CN entry to an existing OU. The only way to get everything working again is to stop slapd, but systemctl stop slapd doesn't work; you have to use kill -9, and that pretty often.
Hoping to get it working again, I cloned the VMs, cut the (normal) network and used a localhost bridge so that both can see each other without issues. Then I created a backup (slapcat), deleted the DB and slapd.d/cn=config, and restored the DB on both. This part worked without issues, but:
```
cat /home/foo/sudo_single.ldif

dn: cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local
objectclass: sudorole
objectclass: top
cn: jochoa_fra_dev_bookworm_02
sudorunasuser: ALL
sudooption: !authenticate
sudocommand: /bin/su
sudohost: fra-dev-bookworm-02.example.local
sudouser: jochoa@example.local

ldapadd -ZZ -c -x -D 'cn=webadmin,ou=Accounts,dc=example,dc=local' -W -H ldap://fra-corp-auth-01.example.com:389 -f /home/foo/sudo_single.ldif -vv

ldap_initialize( ldap://fra-corp-auth-01.example.com:389/??base )
Enter LDAP Password:
add objectclass:
        sudorole
        top
add cn:
        jochoa_fra_dev_bookworm_02
add sudorunasuser:
        ALL
add sudooption:
        !authenticate
add sudocommand:
        /bin/su
add sudohost:
        fra-dev-bookworm-02.example.local
add sudouser:
        jochoa@example.local
adding new entry "cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local"
```
and then it just hangs until I abort with Ctrl+C.
The same happens via ApacheDirectory or the WebADM GUI: sometimes it works, but often not.
```
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: activity on 1 descriptor
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: activity on:
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]:  22r
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]:
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: read active on 22
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: epoll: listen=8 active_threads=0 tvp=zero
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: connection_get(22)
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: connection_get(22): got connid=1041
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: connection_read(22): checking for input on id=1041
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: op tag 0x68, time 1740046822
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: conn=1041 op=1 do_add
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: conn=1041 op=1 do_add: dn (cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local)
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: >>> dnPrettyNormal: <cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local>
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: <<< dnPrettyNormal: <cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local>, <cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local>
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: conn=1041 op=1 ADD dn="cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: => mdb_entry_get: ndn: "cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: => mdb_entry_get: oc: "(null)", at: "(null)"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: mdb_dn2entry("cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local")
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: => mdb_dn2id("cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local")
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: <= mdb_dn2id: get failed: MDB_NOTFOUND: No matching key/data pair found (-30798)
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: => mdb_entry_get: cannot find entry: "cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: mdb_entry_get: rc=32
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: ==> mdb_add: cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_required entry (cn=jochoa_fra_dev_bookworm_02,ou=user_rules,ou=sudoers,dc=example,dc=local), objectClass "sudoRole"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "objectClass"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "cn"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "sudoRunAsUser"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "sudoOption"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "sudoCommand"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "sudoHost"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "sudoUser"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: oc_check_allowed type "structuralObjectClass"
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: activity on 1 descriptor
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: activity on:
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]:
Feb 20 11:20:22 fra-corp-auth-01 slapd[710]: daemon: epoll: listen=8 active_threads=0 tvp=zero
```
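In the log above, conn=1041 op=1 logs an ADD request but no matching RESULT line ever follows. A minimal sketch for spotting such operations in a longer log, assuming slapd's usual stats format of `conn=N op=M ADD ...` and `conn=N op=M RESULT ...`; the function name `pending_ops` is made up for illustration:

```shell
# pending_ops: read slapd log lines on stdin and print every "conn=N op=M"
# that logged a write request (ADD/MOD/DEL/MODRDN) but no RESULT line.
# Assumption: stats-level logging is enabled, so RESULT lines are emitted.
pending_ops() {
    awk '
        # Remember each write request the first time it appears.
        match($0, /conn=[0-9]+ op=[0-9]+ (ADD|MOD|DEL|MODRDN) /) {
            split(substr($0, RSTART, RLENGTH), f, " ")
            key = f[1] " " f[2]
            if (!(key in started)) order[++n] = key
            started[key] = f[3]
        }
        # Mark the operation as finished once its RESULT is logged.
        match($0, /conn=[0-9]+ op=[0-9]+ RESULT /) {
            split(substr($0, RSTART, RLENGTH), f, " ")
            finished[f[1] " " f[2]] = 1
        }
        # Print the requests that never finished, in log order.
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in finished))
                    print order[i], started[order[i]]
        }
    '
}
```

It could be fed from the journal, for example `journalctl -u slapd | pending_ops`, assuming slapd logs there on this host.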
If I try to stop slapd on ldap1:
```
Feb 20 11:23:10 fra-corp-auth-01 slapd[710]: conn=1001 fd=20 closed (slapd shutdown)
Feb 20 11:23:10 fra-corp-auth-01 slapd[710]: connection_closing: readying conn=1041 sd=22 for close
Feb 20 11:23:10 fra-corp-auth-01 slapd[710]: connection_close: deferring conn=1041 sd=22
Feb 20 11:23:10 fra-corp-auth-01 slapd[710]: connection_closing: readying conn=1011 sd=23 for close
Feb 20 11:23:10 fra-corp-auth-01 slapd[710]: connection_close: deferring conn=1011 sd=23
Feb 20 11:23:10 fra-corp-auth-01 slapd[710]: slapd shutdown: waiting for 4 operations/tasks to finish
```
strace shows:
```
futex(0x7f3e9a9ff990, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 720, NULL, FUTEX_BITSET_MATCH_ANY
```
So, if I stop everything and start slapd again, all seems fine:
* ldap2
```
Feb 20 11:49:32 fra-corp-auth-02 slapd[686]: conn=1209 op=2 syncprov_op_search: registered persistent search
Feb 20 11:49:32 fra-corp-auth-02 slapd[686]: conn=1209 op=2 syncprov_op_search: no change, skipping log replay
Feb 20 11:49:32 fra-corp-auth-02 slapd[686]: conn=1209 op=2 syncprov_op_search: nothing changed, finishing up initial search early
Feb 20 11:49:32 fra-corp-auth-02 slapd[686]: conn=1209 op=2 syncprov_sendinfo: refreshDelete cookie=
Feb 20 11:49:32 fra-corp-auth-02 slapd[686]: conn=1209 op=2 syncprov_search_response: detaching op
```
Then I try ldapadd again, and I still see:
* ldap2
```
...
Feb 20 11:54:35 fra-corp-auth-02 slapd[4696]: =>do_syncrepl rid=002
Feb 20 11:54:36 fra-corp-auth-02 slapd[4696]: daemon: epoll: listen=8 active_threads=0 tvp=zero
Feb 20 11:54:36 fra-corp-auth-02 slapd[4696]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Feb 20 11:54:36 fra-corp-auth-02 slapd[4696]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Feb 20 11:54:36 fra-corp-auth-02 slapd[4696]: start_refresh: rid=002 a refresh on rid=001 in progress, pausing
Feb 20 11:54:37 fra-corp-auth-02 slapd[4696]: =>do_syncrepl rid=002
Feb 20 11:54:38 fra-corp-auth-02 slapd[4696]: daemon: epoll: listen=8 active_threads=0 tvp=zero
Feb 20 11:54:38 fra-corp-auth-02 slapd[4696]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Feb 20 11:54:38 fra-corp-auth-02 slapd[4696]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Feb 20 11:54:38 fra-corp-auth-02 slapd[4696]: start_refresh: rid=002 a refresh on rid=001 in progress, pausing
Feb 20 11:54:39 fra-corp-auth-02 slapd[4696]: =>do_syncrepl rid=002
....
```
but an ldapsearch on ldap1 **still works** :-/
On ldap1 the log is silent, except for my ldapsearch, and I have to kill -9 slapd on ldap1 again and start it.
I have no clue what else I can do.
Any hints?
cu denny
If you have to use "kill -9", something is really broken IMHO. Did you try to attach strace to the process to see what it does? (strace -p <PID>)
Kind regards, Ulrich Windl
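Since slapd is multithreaded, strace -p <PID> without -f follows only the thread with that ID, which may be why a single futex wait is all that shows up. A rough sketch for listing every thread and its state first, assuming a Linux /proc filesystem; the helper name `list_threads` is made up for illustration:

```shell
# list_threads PID: print "TID STATE NAME" for every thread of the process.
# Assumption: Linux, with /proc/PID/task readable by the calling user.
list_threads() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -d "$task" ] || continue
        tid=${task##*/}
        # State is a single letter, e.g. S (sleeping) or D (uninterruptible).
        state=$(awk '/^State:/ { print $2 }' "$task/status" 2>/dev/null)
        name=$(cat "$task/comm" 2>/dev/null)
        printf '%s %s %s\n' "$tid" "$state" "$name"
    done
}
```

Something like `list_threads "$(pidof slapd)"` shows the thread IDs; `strace -f -p <PID>` then attaches to all of them at once.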
-----Original Message----- From: linuxmail@4lin.net linuxmail@4lin.net Sent: Thursday, February 20, 2025 12:01 PM To: openldap-technical@openldap.org Subject: [EXT] Debian Bookworm: Issues with stucking / hanging slapd process 2.5, while add / modify entries (master-master replication)