https://bugs.openldap.org/show_bug.cgi?id=9197
Bug ID: 9197
Summary: slapd-ldap/slapo-chain hits error 80 after idletimeout
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: backends
Assignee: bugs@openldap.org
Reporter: quanah@openldap.org
Target Milestone: ---
From a customer:
In order to communicate via the LB managed writable ldap, we have to ensure that an idle connection is periodically refreshed. If we do not, the LB will silently drop the connection after 5 minutes.
Therefore, to combat that, I set an olcIdleTimeout on the writable server so that the connections cached by the chain overlay are removed before the LB timeout hits.
However, the slapd-ldap client goes into CLOSE_WAIT state, which causes subsequent ldapmodify updates brokered by the read-only instance to fail with err=80. A few bugs appear to have been filed on this in the past against slapd-ldap, but it's not clear whether we are hitting the same issue or a new one.
I've also connected the read-only instances directly to the writable ldap instances and the CLOSE_WAIT issue persists, so I don't believe the CLOSE_WAIT issue is caused by the LB.
These are the other threads I found when I started looking into this problem, though I think they are using the ldap proxy:
https://www.openldap.org/lists/openldap-technical/201301/msg00323.html
http://www.openldap.org/lists/openldap-software/201004/msg00060.html
https://www.openldap.org/lists/openldap-bugs/200412/msg00029.html
The LB we have seems to be set to forget connections that last over 5 minutes, per its setting, so a keepalive of 240:10:30 seemed like it should have worked. I assumed it wasn't working because the man page says "Only some systems support the customization of these values". However, only after setting keepalive to 60:10:30 did I maintain a stable connection, so there may be other network settings at play that I'm not aware of.
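To make the keepalive arithmetic in the report concrete, here is a minimal sketch. It assumes the usual reading of the `<idle>:<probes>:<interval>` triplet (seconds idle before the first probe, number of unanswered probes, seconds between probes) and the 5-minute LB timeout mentioned above. The helper names are hypothetical and are not part of OpenLDAP.

```python
from typing import NamedTuple


class Keepalive(NamedTuple):
    idle: int      # seconds of idleness before the first probe is sent
    probes: int    # unanswered probes before the connection is dropped
    interval: int  # seconds between individual probes


def parse_keepalive(value: str) -> Keepalive:
    """Parse an "<idle>:<probes>:<interval>" triplet such as "240:10:30"."""
    idle, probes, interval = (int(part) for part in value.split(":"))
    return Keepalive(idle, probes, interval)


def first_probe_beats_timeout(ka: Keepalive, lb_idle_timeout: int) -> bool:
    """True if the first probe goes out before the LB would drop an idle connection."""
    return ka.idle < lb_idle_timeout


def dead_peer_detection_time(ka: Keepalive) -> int:
    """Worst-case seconds before an unresponsive peer is noticed."""
    return ka.idle + ka.probes * ka.interval


LB_IDLE_TIMEOUT = 300  # assumption: the 5-minute LB timeout from the report

for setting in ("240:10:30", "60:10:30"):
    ka = parse_keepalive(setting)
    print(setting,
          first_probe_beats_timeout(ka, LB_IDLE_TIMEOUT),
          dead_peer_detection_time(ka))
```

On paper both settings send their first probe before the 300-second mark, which fits the reporter's suspicion that something other than the nominal LB timeout is involved.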
Quanah Gibson-Mount quanah@openldap.org changed:
What     |Removed |Added
----------------------------------------------------------------------------
See Also |        |https://bugs.openldap.org/show_bug.cgi?id=4420
Quanah Gibson-Mount quanah@openldap.org changed:
What     |Removed |Added
----------------------------------------------------------------------------
Version  |2.5     |2.4.48
See Also |        |https://bugs.openldap.org/show_bug.cgi?id=3217
Quanah Gibson-Mount quanah@openldap.org changed:
What             |Removed |Added
----------------------------------------------------------------------------
Target Milestone |---     |2.5.0
Keywords         |        |OL_2_5_REQ
Quanah Gibson-Mount quanah@openldap.org changed:
What             |Removed |Added
----------------------------------------------------------------------------
Target Milestone |2.5.0   |2.5.1
--- Comment #1 from Quanah Gibson-Mount quanah@openldap.org --- back-ldap likely is missing a task to close idle connections.
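As an illustration of what such a task might do, here is a minimal sketch of pruning cached connections by idle time and by total lifetime. It is not the back-ldap implementation. The `CachedConn` type, the `prune` helper, and the convention that a limit of 0 disables the check are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CachedConn:
    """Hypothetical stand-in for a cached proxy connection."""
    name: str
    created: int    # time the connection was opened (seconds)
    last_used: int  # time of the most recent operation (seconds)


def is_expired(conn: CachedConn, now: int, idle_timeout: int, conn_ttl: int) -> bool:
    """A connection expires when it has been idle too long or has lived too long.

    Assumption for this sketch: a limit of 0 means the check is disabled.
    """
    if idle_timeout and now - conn.last_used >= idle_timeout:
        return True
    if conn_ttl and now - conn.created >= conn_ttl:
        return True
    return False


def prune(conns: List[CachedConn], now: int,
          idle_timeout: int = 0, conn_ttl: int = 0
          ) -> Tuple[List[CachedConn], List[CachedConn]]:
    """Split connections into (kept, expired); the caller closes the expired ones."""
    kept, expired = [], []
    for conn in conns:
        target = expired if is_expired(conn, now, idle_timeout, conn_ttl) else kept
        target.append(conn)
    return kept, expired


conns = [
    CachedConn("busy", created=0, last_used=95),
    CachedConn("idle", created=0, last_used=10),
]
kept, expired = prune(conns, now=100, idle_timeout=60)
print([c.name for c in kept], [c.name for c in expired])
```

Running a check like this periodically, rather than only when a connection is next used, is what lets the proxy close its side promptly instead of leaving sockets in CLOSE_WAIT.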
--- Comment #2 from tero.saarni@est.tech --- I have submitted merge request https://git.openldap.org/openldap/openldap/-/merge_requests/211
--- Comment #3 from tero.saarni@est.tech --- Here is a notice of origin and rights statement for the patch
The attached patch file is derived from OpenLDAP Software. All of the modifications to OpenLDAP Software represented in the following patch(es) were developed by Tero Saarni tero.saarni@est.tech. I have not assigned rights and/or interest in this work to any party.
Ericsson Software Technology AB hereby place the following modifications to OpenLDAP Software (and only these modifications) into the public domain. Hence, these modifications may be freely used and/or redistributed for any purpose with or without attribution and/or other notice.
Quanah Gibson-Mount quanah@openldap.org changed:
What             |Removed |Added
----------------------------------------------------------------------------
Keywords         |        |reviewed
Target Milestone |2.5.1   |2.5.3
Quanah Gibson-Mount quanah@openldap.org changed:
What             |Removed              |Added
----------------------------------------------------------------------------
Status           |UNCONFIRMED          |RESOLVED
Target Milestone |2.5.3                |2.5.2
Keywords         |OL_2_5_REQ, reviewed |
Resolution       |---                  |FIXED
--- Comment #4 from Quanah Gibson-Mount quanah@openldap.org ---
Commits:
• 0eacc4a7 by Tero Saarni at 2021-02-24T22:07:48+00:00
  ITS#9197 back-ldap: added task that prunes expired connections
Quanah Gibson-Mount quanah@openldap.org changed:
What           |Removed  |Added
----------------------------------------------------------------------------
Ever confirmed |0        |1
Resolution     |FIXED    |---
Status         |RESOLVED |CONFIRMED
--- Comment #5 from Quanah Gibson-Mount quanah@openldap.org --- Ever since this went in, we've started getting sporadic test failures of test079, breaking CI/CD.
--- Comment #6 from Quanah Gibson-Mount quanah@openldap.org --- Trivially reproducible:
Cleaning up test run directory from this run.
Running 17 of 500 iterations
running defines.sh
Running slapadd to build database for the remote slapd server...
Starting remote slapd server on TCP/IP port 9011...
Starting slapd proxy on TCP/IP port 9012...
Create shared connection towards remote LDAP (time_t now=1614220114 timeout=1614220118)
Checking that proxy has created connections towards backend
Sleeping until idle-timeout and conn-ttl have passed
Checking that proxy has closed expired connections towards the remote LDAP server (time_t now=1614220119)
Create private connection towards remote LDAP (time_t now=1614220119 timeout=1614220123)
Checking that proxy has created connections towards backend
Sleeping until idle-timeout and conn-ttl have passed
Checking that proxy has closed expired connections towards the remote LDAP server (time_t now=1614220125)
Checking that idle-timeout is reset on activity
Create cached connection: idle-timeout timeout starts (time_t now=1614220125, original_timeout=1614220129)
Do another search to reset the timeout (time_t now=1614220128, new_timeout=1614220132)
Check that connection is still alive due to idle-timeout reset (time_t now=1614220132)
Error: LDAP connection to remote LDAP server is not found (1)
Failed after 17 of 500 iterations
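One possible reading of the timestamps in this failing run, offered as an interpretation rather than a confirmed root cause: the final check ran at the same second the reset idle-timeout expired, leaving no margin. The sketch below only redoes that arithmetic with values copied from the log. The function name is hypothetical.

```python
def timeout_margin(created_at: int, original_timeout: int,
                   second_search_at: int, checked_at: int) -> tuple:
    """Return (idle_timeout, new_timeout, margin) in whole seconds.

    margin is how long before expiry the liveness check ran;
    0 or less means the check raced with, or came after, the expiry.
    """
    idle_timeout = original_timeout - created_at
    new_timeout = second_search_at + idle_timeout
    margin = new_timeout - checked_at
    return idle_timeout, new_timeout, margin


# Values copied from the last three timestamped lines of the log.
idle_timeout, new_timeout, margin = timeout_margin(
    created_at=1614220125,
    original_timeout=1614220129,
    second_search_at=1614220128,
    checked_at=1614220132,
)
print(idle_timeout, new_timeout, margin)
```

The computed `new_timeout` matches the `new_timeout=1614220132` printed by the test, and the margin at the time of the check is zero seconds, so whether the connection is still present would depend on sub-second scheduling.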
tero.saarni@est.tech changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |tero.saarni@est.tech
--- Comment #7 from tero.saarni@est.tech --- Sorry for the flaky test!
I've improved it and submitted a merge request https://git.openldap.org/openldap/openldap/-/merge_requests/255
Quanah Gibson-Mount quanah@openldap.org changed:
What       |Removed   |Added
----------------------------------------------------------------------------
Resolution |---       |FIXED
Status     |CONFIRMED |RESOLVED
--- Comment #8 from Quanah Gibson-Mount quanah@openldap.org ---
Commits:
• 3db2e4a0 by Tero Saarni at 2021-02-25T16:56:55+02:00
  ITS#9197 Increase timeouts in test case due to sporadic failures
Quanah Gibson-Mount quanah@openldap.org changed:
What   |Removed  |Added
----------------------------------------------------------------------------
Status |RESOLVED |VERIFIED