Hi,
We use OpenLDAP version 2.5.19.1 (LTB build).
We have an LDAP directory with about 4.6 million entries and an indexed attribute which occurs around 3.9 million times. We typically filter on that attribute with a specific value (equality), which is typically very fast with no problems. As soon as the same value is used twice, the execution time for that filter becomes really slow, even when additional criteria in the filter limit the result to exactly 1 entry. Search time is at ~5% for single-entry results compared to potential 2-entry results.
Some more details on how this was determined:
1) enable "stats" logging on the production server for 5 minutes.
2) collect the slowest ~1200 of the several thousand searches within those 5 minutes from the log.
3) create a separate LDAP server with the exact same data and configuration (imported with slapadd).
4) run a script locally on the extra server which executes the 1200 filters one after the other and measures the complete execution time of the script.
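A minimal sketch of the replay harness from step 4. The file name, base DN, and LDAP URI here are assumptions for illustration, not details from the original post; it shells out to the standard ldapsearch CLI:

```python
#!/usr/bin/env python3
"""Replay a batch of LDAP filters sequentially and time the whole run.
FILTER_FILE, BASE_DN and LDAP_URI are placeholder assumptions."""
import os
import shutil
import subprocess
import time

FILTER_FILE = "filters.txt"      # one LDAP filter per line (assumed name)
BASE_DN = "dc=example,dc=com"    # placeholder base DN
LDAP_URI = "ldapi:///"           # local socket, matching the local test setup

def replay(filters):
    """Run each filter sequentially; return total wall-clock seconds."""
    start = time.perf_counter()
    for flt in filters:
        # "1.1" requests no attributes: only search latency matters here
        subprocess.run(
            ["ldapsearch", "-LLL", "-x", "-H", LDAP_URI,
             "-b", BASE_DN, flt, "1.1"],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
            check=False,
        )
    return time.perf_counter() - start

if shutil.which("ldapsearch") and os.path.exists(FILTER_FILE):
    with open(FILTER_FILE) as fh:
        batch = [line.strip() for line in fh if line.strip()]
    print(f"total: {replay(batch):.3f}s for {len(batch)} searches")
else:
    print("nothing to replay (no ldapsearch or no filter file)")
```

Running it twice against the same server also shows cache-warming effects, which is consistent with the 11s vs. 5s difference reported below after re-adding entries versus a full re-import.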
With production data I measure around 11s for the ~1200 searches. For all of these searches, one attribute in the filter could have 2 hits, but it is actually limited to 1 hit by the following filter: "(&(objectClass=value)(almost_uniqe_attr=value)(another_attr=*))". That means searching with only "(almost_uniqe_attr=value)" as the filter would return 2 results, but objectClass and another_attr limit it to exactly 1 entry.
When I now remove the second entry from the LDAP server for these exact ~1200 filters, the script runtime drops to ~0.5s. If I re-add those ~1200 entries, the runtime is around 5s (and after a complete recreation of the DB it is 11s again).
Limiting the search scope by using a more specific base DN for the search does not change the execution time at all.
So the question is: can I change anything on the server side to speed up the execution time of these searches?
Regards, Norbert
Norbert wrote:
So the question is: 1) can I change anything on the server side to speed up the execution time of these searches?
How common is 'another_attr'? Is there a presence index on it?
On 07.04.25 at 14:19, Howard Chu wrote:
How common is 'another_attr'? Is there a presence index on it?
another_attr is the most frequently occurring attribute on the server. Typically each value occurs once, but in this particular case the majority of values are shared by 2 entries. The index for this attribute is configured as "eq,sub".
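For context, such indexes are declared via olcDbIndex in cn=config. A sketch of how a presence index would be added to answer Howard's question; the database RDN ({1}mdb) and the attribute names are placeholders from this thread, not the poster's real config, and the index must be rebuilt afterwards (e.g. with slapindex while slapd is stopped):

```ldif
dn: olcDatabase={1}mdb,cn=config
changetype: modify
delete: olcDbIndex
olcDbIndex: another_attr eq,sub
-
add: olcDbIndex
olcDbIndex: another_attr pres,eq,sub
```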
On 07.04.25 at 15:35, Norbert wrote:
another_attr is the most occuring attribute in the server, typically values occur once but in this particular case it is the majority that 2 entries are referenced with this attribute. The index for this attribute is configured as "eq,sub".
Sorry, I got confused by my arbitrary names. Each entry of interest has another_attr set, but removing (another_attr=*) from the test filters has no impact on performance: with or without it the runtime is the same, and it still returns 1 entry because objectClass is what matters in these cases. another_attr has an eq index but no pres index. Many entries actually share the same value in this case.
Regards, Norbert
On Mon, Apr 07, 2025 at 03:56:17PM +0200, Norbert wrote:
Hi Norbert, just a thought:
It looks like you also have a "sub"string index on that attribute. All indexes for a given attribute live in the same namespace, and a substring index generates a *lot* of items, so you'll get false positives competing for slapd's attention. Have you enabled 64-bit hashes already ("index_hash64 on")? It should help with the contention if you haven't yet.
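For anyone following along, that setting is global, and indexes generally have to be regenerated after changing it (e.g. a wipe and slapadd re-import, as done in the follow-up). Sketches of both configuration styles:

```
# slapd.conf style
index_hash64 on
```

```ldif
# cn=config equivalent
dn: cn=config
changetype: modify
replace: olcIndexHash64
olcIndexHash64: TRUE
```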
Regards,
Hi,
On 08.04.25 at 12:32, Ondřej Kuzník wrote:
It looks like you also have a "sub"string index on that attribute, all indexes for a given attribute exist in the same namespace and a substring index generates a *lot* of items. So you'll get false positives competing for slapd's attention - have you enabled 64bit hashes already ("index_hash64 on")?
I did two further tests:
1) olcIndexHash64: TRUE
2) olcIndexHash64: TRUE, and keeping only eq for almost_uniqe_attr
In both cases the config and data were wiped and re-created with slapadd. I confirmed that the key size in the index is now 64 bits.
mdb_stat for the index with eq,sub and 32-bit index keys, from a running server:

Status of almost_uniqe_attr
  Tree depth: 3
  Branch pages: 256
  Leaf pages: 47269
  Overflow pages: 0
  Entries: 47486472
mdb_stat for the index with eq only and 64-bit index keys, after a fresh import:

Status of almost_uniqe_attr
  Tree depth: 4
  Branch pages: 261
  Leaf pages: 41908
  Overflow pages: 0
  Entries: 3931262
Unfortunately there was no change in runtime. The 1200 queries still take around 11s; if anything, it is a tiny bit slower at 12s.
Regards, Norbert
On 08.04.25 at 22:56, Norbert wrote:
Unfortunately there was no change in runtime. The 1200 queries still take around 11s, might be even a tiny bit slower with 12s.
When running those 1200 filters while recording activity with perf in parallel, I get the following at the top:
# Samples: 45K of event 'cpu-clock:pppH'
# Event count (approx.): 11330500000
#
# Children      Self  Command  Shared Object           Symbol
# ........  ........  .......  ......................  ...................................
    41.12%    40.82%  slapd    back_mdb-2.5.so.0.1.14  [.] mdb_idl_next
            |
            ---mdb_idl_next

    31.71%    31.55%  slapd    back_mdb-2.5.so.0.1.14  [.] mdb_idl_intersection
            |
            ---mdb_idl_intersection

    24.25%    24.11%  slapd    back_mdb-2.5.so.0.1.14  [.] mdb_idl_next@plt
            |
            ---mdb_idl_next@plt

     1.13%     0.00%  slapd    [kernel.kallsyms]       [k] entry_SYSCALL_64_after_hwframe
            |
            ---entry_SYSCALL_64_after_hwframe
And when removing the second entry, I get the following recording for the same 1200 filters:
# Samples: 420 of event 'cpu-clock:pppH'
# Event count (approx.): 105000000
#
# Children      Self  Command  Shared Object      Symbol
# ........  ........  .......  .................  ......................................
    41.90%     0.00%  slapd    [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
            |
            ---entry_SYSCALL_64_after_hwframe
            |
            |--41.19%--do_syscall_64
            |          |
            |          |--22.62%--ksys_write
Thanks, Norbert
Norbert wrote:
When running those 1200 filters and recording activity with perf in parallel I get at the top
The best way to diagnose this is to run a single search while gdb'ing slapd and check which two IDLs are being operated on in mdb_idl_intersection. Considering that 24% of CPU time is spent in the mdb_idl_next PLT stub, you're seeing a ton of overhead simply from this backend being built as a dynamic module. You might be able to eliminate this overhead by adding -Bsymbolic to the linker invocation for back-mdb.
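As a rough intuition for why those two functions dominate: intersecting two sorted ID lists costs time proportional to the lists walked, even when the final result is a single ID. A toy model of that merge (this is an illustration, not back-mdb's actual code; the ID values are made up):

```python
def idl_intersection(a, b):
    """Intersect two sorted lists of entry IDs (a toy IDL model).
    The merge advances through both lists, so the cost tracks the
    length of the lists walked, not the size of the result."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

# Hypothetical numbers in the spirit of the thread: a huge candidate
# list from one filter component vs. the 2-element list for the
# duplicated almost_uniqe_attr value. The result is one ID, but the
# merge still has to walk far into the large list to find it.
big = list(range(0, 2_000_000, 2))   # 1M candidate IDs
small = [41, 1_999_998]              # the 2 entries sharing a value
print(idl_intersection(big, small))  # -> [1999998]
```

This is why the runtime jumps as soon as a value stops being unique: a single-ID lookup can short-circuit, while a 2-ID candidate list forces the intersection machinery to run.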
openldap-technical@openldap.org