https://bugs.openldap.org/show_bug.cgi?id=9924
Issue ID: 9924
Summary: Increased/RunAway memory usage slapo-deref
Product: OpenLDAP
Version: 2.5.13
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs@openldap.org
Reporter: erikdewaard@gmail.com
Target Milestone: ---
Created attachment 916 --> https://bugs.openldap.org/attachment.cgi?id=916&action=edit slapd.conf
Increased/RunAway memory usage slapo-deref
Running: 2.5.13
After enabling slapo-deref, slapd memory usage increased and keeps growing. I can reproduce this on every consumer with deref enabled.
From:
PID     USER PR NI VIRT   RES    SHR    S %CPU %MEM TIME+     COMMAND
173229  ldap 20 0  26.6g  1.0g   941996 S 4.0  0.8  3674:19   slapd
To:
PID     USER PR NI VIRT   RES    SHR    S %CPU %MEM TIME+     COMMAND
2745810 ldap 20 0  141.5g 115.8g 468940 S 3.0  92.5 312:42.93 slapd
How best to debug this? I should probably recompile to get all symbols for slapd available.
#valgrind.sh
valgrind --leak-check=full \
  --show-leak-kinds=all \
  --extra-debuginfo-path=/usr/lib/debug/usr/lib64/openldap \
  --allow-mismatched-debuginfo=yes \
  --track-origins=yes \
  --error-limit=no \
  --verbose \
  --log-file=valgrind-out.txt \
  /usr/sbin/slapd -F /etc/openldap/slapd.d -u ldap -h "ldap:/// ldaps:/// ldapi:///"

#mleak.sh
LD_PRELOAD=/tmp/mleak/mleak.so \
  /usr/sbin/slapd -F /etc/openldap/slapd.d -u ldap -h "ldap:/// ldaps:/// ldapi:///"
sent kill -2
#mleak_report.sh
./mdump /usr/sbin/slapd ml.* ./report.sh | more
fncdump: Cant open linux-vdso.so.1
Memory leaks (14480 total):
erikdewaard@gmail.com changed:
What                       |Removed |Added
----------------------------------------------------------------------------
Attachment #916 is obsolete|0       |1
--- Comment #1 from erikdewaard@gmail.com --- Created attachment 917 --> https://bugs.openldap.org/attachment.cgi?id=917&action=edit conf and valgrind output.
--- Comment #2 from Ondřej Kuzník ondra@mistotebe.net --- Hi, if you can isolate the traffic leading to the leak, would you mind preparing a small script that reproduces the issue? Your configuration is quite involved so rather than trying to fill in all the missing pieces to be able to run it, I tried to isolate small bits of it but could not identify any leaks for searches I've run (assuming it's a leak inside deref, which we should be able to confirm once you've recompiled).
Thanks, Ondrej
--- Comment #3 from Howard Chu hyc@openldap.org --- You need to also use the valgrind option --keep-debuginfo=yes otherwise the symbols for dynamically loaded modules are lost when they're unloaded at slapd shutdown time.
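For reference, a sketch of the reporter's valgrind wrapper with the --keep-debuginfo=yes option from comment #3 added. The function name valgrind_slapd and the DRY_RUN switch are illustrative assumptions, not part of the report; the paths and listener URLs are copied from the original script and may differ on other systems.

```shell
#!/bin/sh
# Sketch only: the reporter's valgrind.sh plus --keep-debuginfo=yes.
# valgrind_slapd and DRY_RUN are hypothetical helpers for illustration.
valgrind_slapd() {
    set -- valgrind \
        --leak-check=full \
        --show-leak-kinds=all \
        --keep-debuginfo=yes \
        --extra-debuginfo-path=/usr/lib/debug/usr/lib64/openldap \
        --allow-mismatched-debuginfo=yes \
        --track-origins=yes \
        --error-limit=no \
        --verbose \
        --log-file=valgrind-out.txt \
        /usr/sbin/slapd -F /etc/openldap/slapd.d -u ldap \
        -h "ldap:/// ldaps:/// ldapi:///"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print one argument per line instead of starting slapd.
        printf '%s\n' "$@"
    else
        "$@"
    fi
}
```

Calling valgrind_slapd with DRY_RUN=1 set prints the argument list for review; without it, the function starts slapd under valgrind exactly as the original script would.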
Howard Chu hyc@openldap.org changed:
What          |Removed     |Added
----------------------------------------------------------------------------
Ever confirmed|0           |1
Status        |UNCONFIRMED |IN_PROGRESS
--- Comment #4 from Howard Chu hyc@openldap.org --- A fix is in https://git.openldap.org/openldap/openldap/-/merge_requests/568 please verify, thanks.
--- Comment #5 from Michael Ströder michael@stroeder.com ---
> --- Comment #4 from Howard Chu hyc@openldap.org ---
> A fix is in https://git.openldap.org/openldap/openldap/-/merge_requests/568 please verify, thanks.
I'd like to roll this out as a back-port patch for Æ-DIR servers to check whether this also fixes ITS#9365.
Your patch [1] applies cleanly to RE26 and seems to work on my local test systems (with python-ldap0, ae-dir-tool, aehostd and web2ldap using deref control). Anything else I should test before pushing this to production?
[1] https://code.stroeder.com/AE-DIR/debian-openldap-ms/src/branch/main/debian/p...
--- Comment #6 from Howard Chu hyc@openldap.org --- (In reply to Michael Ströder from comment #5)
> > --- Comment #4 from Howard Chu hyc@openldap.org ---
> > A fix is in https://git.openldap.org/openldap/openldap/-/merge_requests/568 please verify, thanks.
>
> I'd like to roll this out as a back-port patch for Æ-DIR servers to check whether this also fixes ITS#9365.
>
> Your patch [1] applies cleanly to RE26 and seems to work on my local test systems (with python-ldap0, ae-dir-tool, aehostd and web2ldap using deref control). Anything else I should test before pushing this to production?
Triggering the leak requires first exhausting the per-thread tmpmem allocator, so you need a fairly large search response set with a lot of returned deref values. Have you verified that you can trigger the leak without the patch, and that the leak is gone with the patch in place?
--- Comment #7 from Michael Ströder michael@stroeder.com ---
> --- Comment #6 from Howard Chu hyc@openldap.org ---
> Triggering the leak requires first exhausting the per-thread tmpmem allocator, so you need a fairly large search response set with a lot of returned deref values.

What does "fairly large" mean? A few thousand entries? Or millions?
--- Comment #8 from erikdewaard@gmail.com --- (In reply to Howard Chu from comment #4)
> A fix is in https://git.openldap.org/openldap/openldap/-/merge_requests/568 please verify, thanks.
Almost an hour in and looking very promising, verified.
#Without Patch
PID     USER PR NI VIRT  RES    SHR    S %CPU %MEM TIME+   COMMAND
1539782 ldap 20 0  26.2g 1.4g   729904 S 17.6 1.1  0:49.58 slapd
#With Patch
PID     USER PR NI VIRT  RES    SHR    S %CPU %MEM TIME+   COMMAND
3241557 ldap 20 0  25.6g 781948 734952 S 1.3  0.6  0:42.91 slapd
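A comparison like the one above can also be logged over time rather than read off top. The following is a minimal sketch, assuming a Linux /proc filesystem; rss_kb and watch_rss are hypothetical helper names, and the values are the kernel's VmRSS in kB, which should correspond to the RES column.

```shell
#!/bin/sh
# Sketch only: sample the resident set size of a process on Linux.
# rss_kb and watch_rss are hypothetical helpers, not OpenLDAP tools.

# Print VmRSS (in kB) for the given PID.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Print "<epoch seconds> <rss kB>" COUNT times, INTERVAL seconds apart.
# Usage: watch_rss PID [INTERVAL] [COUNT]
watch_rss() {
    pid=$1
    interval=${2:-60}
    count=${3:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date +%s)" "$(rss_kb "$pid")"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, watch_rss "$(pidof slapd)" 60 60 would record one sample per minute for an hour, which makes steady growth on an unpatched consumer easy to tell apart from a flat line on a patched one.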
--- Comment #9 from Howard Chu hyc@openldap.org --- (In reply to Michael Ströder from comment #7)
> > --- Comment #6 from Howard Chu hyc@openldap.org ---
> > Triggering the leak requires first exhausting the per-thread tmpmem allocator, so you need a fairly large search response set with a lot of returned deref values.
>
> What does "fairly large" mean? A few thousand entries? Or millions?
The tmpmem allocator can handle up to 1MB by default, so the size of a single search response entry plus all of the deref values must consume more than 1MB to trigger the leak.
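A reproduction attempt along these lines could be sketched as a search that uses the deref control against an entry with many referenced values. Everything site-specific here is an assumption: the URI, base DN, filter, and the member:cn,mail deref specification are placeholders, deref_search and DRY_RUN are hypothetical helpers, and authentication options are omitted. Only the -E deref=derefAttr:attr[,attr] search extension itself is taken from ldapsearch(1).

```shell
#!/bin/sh
# Sketch only: issue a search with the deref control via ldapsearch.
# All values below are placeholders; adjust them to the local directory.
LDAP_URI=${LDAP_URI:-ldapi:///}
BASE_DN=${BASE_DN:-dc=example,dc=com}
DEREF_SPEC=${DEREF_SPEC:-member:cn,mail}

# deref_search is a hypothetical helper; DRY_RUN=1 prints the arguments.
deref_search() {
    set -- ldapsearch -LLL -H "$LDAP_URI" -b "$BASE_DN" \
        -E "deref=$DEREF_SPEC" \
        '(objectClass=groupOfNames)' member
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$@"
    else
        # Discard the results; only the server-side memory use matters.
        "$@" >/dev/null
    fi
}
```

Per the explanation above, the leak should only appear when a single returned entry plus its deref values exceeds the 1MB default, so the filter needs to match at least one entry with a large number of referenced values. Running deref_search repeatedly while watching the slapd RES figure would then show whether memory keeps growing.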
--- Comment #10 from Howard Chu hyc@openldap.org --- (In reply to Howard Chu from comment #3)
> You need to also use the valgrind option --keep-debuginfo=yes otherwise the symbols for dynamically loaded modules are lost when they're unloaded at slapd shutdown time.
(Reminded me that we worked over 14 years to get this option added to valgrind... https://bugs.kde.org/show_bug.cgi?id=79362 )
--- Comment #11 from Michael Ströder michael@stroeder.com --- On 9/30/22 01:52, openldap-its@openldap.org wrote:
> https://bugs.openldap.org/show_bug.cgi?id=9924
> The tmpmem allocator can handle up to 1MB by default, so the size of a single search response entry plus all of the deref values must consume more than 1MB to trigger the leak.
It seems an Æ-DIR reporting script can trigger this (with 4000 aeUser entries referencing 4000 aePerson entries).
Quanah Gibson-Mount quanah@openldap.org changed:
What            |Removed      |Added
----------------------------------------------------------------------------
Keywords        |needs_review |
Target Milestone|---          |2.5.14
Quanah Gibson-Mount quanah@openldap.org changed:
What    |Removed |Added
----------------------------------------------------------------------------
See Also|        |https://bugs.openldap.org/show_bug.cgi?id=9365
Quanah Gibson-Mount quanah@openldap.org changed:
What      |Removed     |Added
----------------------------------------------------------------------------
Status    |IN_PROGRESS |RESOLVED
Resolution|---         |FIXED
--- Comment #12 from Quanah Gibson-Mount quanah@openldap.org ---
head:
• e640ce28 by Howard Chu at 2022-09-29T21:44:25+00:00 ITS#9924 slapo-deref: plug memleak
RE26:
• 40fb9a90 by Howard Chu at 2022-10-03T16:35:10+00:00 ITS#9924 slapo-deref: plug memleak
RE25:
• 1f99099c by Howard Chu at 2022-10-03T16:36:02+00:00 ITS#9924 slapo-deref: plug memleak
Quanah Gibson-Mount quanah@openldap.org changed:
What    |Removed           |Added
----------------------------------------------------------------------------
Assignee|bugs@openldap.org |hyc@openldap.org
Quanah Gibson-Mount quanah@openldap.org changed:
What  |Removed  |Added
----------------------------------------------------------------------------
Status|RESOLVED |VERIFIED