On 2/15/2022 1:57 AM, Ondřej Kuzník wrote:
> - your DB is just over the size of available RAM by itself
Yes, but that size includes not just the data itself, but all of the
indexes as well, right?
> - after a while using the system, other processes (and slapd) will
>   carve out a fair amount of it that the system will be
>   unwilling/unable to page out
Yes. But that is not currently the case. Here is a slapd process on one
of our nodes that has been up about a week and a half:
ldap 1207 1 9 Feb04 ? 1-01:46:47
/opt/symas/lib/slapd -d 0 -h ldap:/// ldaps:/// ldapi:/// -u ldap -g ldap
Its resident set is a bit less than a gigabyte:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM    TIME+ COMMAND
 1207 ldap      20   0 8530708 954688 829836 S  28.1 47.8  1546:40 slapd
While unused (i.e. wasted) memory is only 82M, the amount of memory in
use by buffer/cache that the system would be willing to give up at any
time is more than a gigabyte:
              total        used        free      shared  buff/cache   available
Mem:           1949         413          82           0        1453        1382
Swap:          1023         295         728
When the problem occurs, there isn't a memory shortage; there is still
free memory. Nothing is getting paged in or out, and the only I/O is
application reads, not swap.
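(For reference, this is roughly how I'm checking that; it assumes the
sysstat package for pidstat, and 1207 is the slapd PID shown above:)

# Watch for swap activity while the slow query runs; the si/so
# columns stay at 0 the whole time:
vmstat 1 5

# Per-process disk I/O for slapd, confirming the reads are
# application reads rather than swap traffic:
pidstat -d -p 1207 1 5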
> - if, to answer that query, you need to crawl a large part of the DB,
>   the OS will have to page that part into memory. At the beginning
>   there is enough RAM to do it all just once; later, you've reached a
>   threshold and it needs to page bits in and then drop them again to
>   fetch others, and you develop these symptoms - lots of read I/O and
>   a delay in processing
Intuitively that does sound like a good description of the problem I'm
having. But the only thing that takes a long time is returning the
memberOf attribute. When queries requesting it are taking more than 30
seconds or even minutes to respond, all other queries remain
instantaneous. It seems unlikely that, under memory pressure, the only
queries forced to fault pages in and be degraded would be those? Every
other query just happens to have what it needs still in memory?
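One way I can quantify that is to compare per-operation elapsed times
in the logs. A rough sketch, assuming "loglevel stats" is enabled and
syslog writes to /var/log/slapd.log (the path is a guess and will vary):

# RESULT lines carry etime= (elapsed time per operation) under
# "loglevel stats"; pull out the slowest operations:
grep -o 'etime=[0-9.]*' /var/log/slapd.log | sort -t= -k2 -n | tail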
> Figure out what is involved in that search and see if you can tweak
It's not a very complicated query:
# ldapsearch -x -H ldapi:/// uid=henson memberOf
[...]
dn: uid=henson,ou=user,dc=cpp,dc=edu
memberOf: uid=idm,ou=group,dc=cpp,dc=edu
memberOf: uid=iit,ou=group,dc=cpp,dc=edu
If I understand correctly, this just needs to access the index on uid to
find my entry, and then the dynlist module presumably does something
like this:
# ldapsearch -x -H ldapi:/// member=uid=henson,ou=user,dc=cpp,dc=edu dn
[...]
# cppnet, group, cpp.edu
dn: uid=cppnet,ou=group,dc=cpp,dc=edu
# cppnet-admin, group, cpp.edu
dn: uid=cppnet-admin,ou=group,dc=cpp,dc=edu
This just needs to access the index on member to find all of the group
objects, of which there are 36 in my case.
So it only needs to have two indexes and 37 objects in memory to perform
quickly, right?
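(Both attributes are indexed here. For reference, this is roughly how
I check; it assumes cn=config with an MDB backend, and the {1}mdb RDN
is a guess:)

# List the configured indexes; the eq indexes on uid and member are
# what the two lookups above rely on:
ldapsearch -Y EXTERNAL -H ldapi:/// -LLL \
    -b 'olcDatabase={1}mdb,cn=config' olcDbIndex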
When performance on memberOf queries is degraded, this takes more than
30 seconds to run. Every single time. I can run it 20 times in a row
and it always takes more than 30 seconds. If it were a memory issue,
you would think that at least some of the queries would get lucky and
find the pages they need already in memory, given they had just been
accessed moments before?
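(For concreteness, this is the kind of loop I mean; every single
iteration exceeds 30 seconds:)

# Run the test query 20 times in a row and time each one; if it were
# just cold pages, at least some repeats should come back fast:
for i in $(seq 1 20); do
    time ldapsearch -x -H ldapi:/// uid=henson memberOf > /dev/null
done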
I can certainly just throw memory at it and hope the problem goes away.
But based on what I observe when it occurs, it does not feel like just
a memory problem. The last time it happened I pulled the node out of
the load balancer so nothing else was poking at it, and the test query
was still taking more than 30 seconds.
I'm going to bump the production nodes up to 4G, which should be more
than enough to run the OS and always have the entire database plus all
indexes in memory. I will keep my fingers crossed this problem just goes
away, but if it doesn't, what else can I do when it occurs to help track
it down?
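For example, would it be useful to capture something like the
following the next time it happens? (These are just the tools that
come to mind; strace and gdb may need to be installed:)

# Per-thread CPU usage, to see whether one slapd thread is spinning:
top -H -p "$(pidof slapd)"

# Syscall counts across all threads for ~30 seconds; lots of pread()s
# would point at the DB being crawled from disk:
timeout 30 strace -f -c -p "$(pidof slapd)"

# Stack snapshot of all threads while the query is stuck:
gdb -p "$(pidof slapd)" -batch -ex 'thread apply all bt'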
Thanks much…