Help debugging slave slapd issues

BECOT Jérôme

25 Mar 2024

7:49 a.m.

Hello,

On all our various OpenLDAP 2.4 and 2.5 slaves of 2.4 servers, we see a lot of deferring errors:

slapd[37277]: connection_input: conn=32974 deferring operation: too many executing

or

slapd[37277]: connection_input: conn=32974 deferring operation: pending operations

Can you give any hints about a way to debug what causes these messages? (Which log level should I aim for, and which system settings should I check?)

I can provide more details then.

Regards Jerome

Quanah Gibson-Mount

25 Mar

9:54 a.m.

--On Monday, March 25, 2024 3:49 PM +0000 BECOT Jérôme jbecot@itsgroup.com wrote:

> On all different OpenLDAP 2.4 and 2.5 slaves of 2.4 servers, we see a lot of deferring errors:
> slapd[37277]: connection_input: conn=32974 deferring operation: too many executing
> or
> slapd[37277]: connection_input: conn=32974 deferring operation: pending operations

Those aren't errors.

--Quanah

Christopher Paul

10:06 a.m.

> Those aren't errors.

But a deferral is not optimal, is it? I think the question "hints about way to debug" is probably a good one. The brute force method to fix this would be to add consumers and spread out the load. Horizontal scaling is the main benefit of a replicated architecture.

Chris Paul | https://www.rexconsulting.net

Quanah Gibson-Mount

10:44 a.m.

--On Monday, March 25, 2024 6:06 PM +0000 Christopher Paul chris.paul@rexconsulting.net wrote:

>> Those aren't errors.
>
> But a deferral is not optimal, is it? I think the question "hints about way to debug" is probably a good one. The brute force method to fix this would be to add consumers and spread out the load. Horizontal scaling is the main benefit of a replicated architecture.

Deferrals are common. They are not necessarily indicative of an issue, and without more detail there's no way to determine whether there is an issue that needs to be addressed.

--Quanah

Howard Chu

12:52 p.m.

Quanah Gibson-Mount wrote:

>> slapd[37277]: connection_input: conn=32974 deferring operation: too many executing
>
> Deferrals are common. They are not necessarily indicative of an issue, and without more detail there's no way to determine whether there is an issue that needs to be addressed.

Yes, they're common, and these are caused by a client sending too many operations over a connection without waiting for them to complete. In other words, a poorly written client.
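One way a client can avoid flooding a single connection is to cap the number of operations it has outstanding at any moment. The sketch below is only an illustration of that idea in Python, not OpenLDAP or client-library code: `FakeConnection`, `run_bounded`, and the limit of 4 are hypothetical names and values chosen for the example, and the fake `search` merely sleeps instead of talking to a server.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for one LDAP connection; tracks concurrency."""

    def __init__(self):
        self.in_flight = 0  # operations currently outstanding
        self.peak = 0       # highest number outstanding at once

    async def search(self, n):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.001)  # pretend the server is working
        self.in_flight -= 1
        return n


async def run_bounded(operations, max_in_flight=4):
    """Run zero-argument coroutine factories, at most max_in_flight at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(op):
        # Wait for a free slot before sending the next operation.
        async with semaphore:
            return await op()

    # gather() returns results in the order the operations were given.
    return await asyncio.gather(*(guarded(op) for op in operations))


def demo():
    conn = FakeConnection()
    ops = [lambda n=n: conn.search(n) for n in range(20)]
    results = asyncio.run(run_bounded(ops, max_in_flight=4))
    return results, conn.peak
```

Calling `demo()` issues 20 fake searches while never having more than four outstanding on the connection. A real client would apply the same cap around whatever asynchronous call its LDAP library provides.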

Simply adding more replicas does nothing to address this; you need a load balancer that spreads all client queries out, even when they're all coming in from a single connection.

Better yet is to identify the client and fix it.
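To find which client triggers the deferrals, one approach is to map the `conn=` number in the deferral message back to the peer address logged when that connection was accepted. The sketch below assumes that stats-level logging is enabled so that slapd writes `ACCEPT from IP=...` lines, and that those lines look like the sample shown; check both against your own logs. The sample addresses and the second connection number are made up, and `deferrals_by_client` is a hypothetical helper, not part of OpenLDAP.

```python
import re
from collections import Counter

# Assumed log shapes; adjust the patterns to match your syslog output.
ACCEPT_RE = re.compile(r"conn=(\d+) fd=\d+ ACCEPT from IP=(\S+):\d+")
DEFER_RE = re.compile(r"connection_input: conn=(\d+) deferring operation: (.+)$")

# Made-up sample; 192.0.2.10 is a documentation-only address.
SAMPLE_LOG = [
    "slapd[37277]: conn=32974 fd=21 ACCEPT from IP=192.0.2.10:51234 (IP=0.0.0.0:389)",
    "slapd[37277]: connection_input: conn=32974 deferring operation: too many executing",
    "slapd[37277]: connection_input: conn=32974 deferring operation: pending operations",
    "slapd[37277]: connection_input: conn=40000 deferring operation: pending operations",
]


def deferrals_by_client(lines):
    """Count deferral messages per (client IP, reason)."""
    conn_to_ip = {}
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        accepted = ACCEPT_RE.search(line)
        if accepted:
            # Remember which peer opened this connection number.
            conn_to_ip[accepted.group(1)] = accepted.group(2)
            continue
        deferred = DEFER_RE.search(line)
        if deferred:
            conn_id, reason = deferred.groups()
            # Connections accepted before the log window show as "unknown".
            counts[(conn_to_ip.get(conn_id, "unknown"), reason)] += 1
    return counts


if __name__ == "__main__":
    for (ip, reason), n in deferrals_by_client(SAMPLE_LOG).most_common():
        print(f"{n:6d}  {ip:15s}  {reason}")
```

On a real system, pass an open log file instead of `SAMPLE_LOG`. Note that connection numbers restart when slapd restarts, so the mapping is only reliable within a single run of the server.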

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

BECOT Jérôme

27 Mar

1:40 a.m.

Thank you for the help. We will look at the clients. I fear sssd would be the culprit, but we have to investigate first.

Howard Chu

7:07 a.m.

BECOT Jérôme wrote:

> Thank you for the help. We will look at the clients. I fear sssd would be the culprit, but we have to investigate first.

There's really nothing that needs to be done here. The deferred operations will eventually get processed.

Frédéric Goudal

7:51 a.m.

Hello,

On 25 March 2024 at 20:52, Howard Chu hyc@symas.com wrote:

> Simply adding more replicas does nothing to address this, you need a load balancer that spreads all client queries out, even when they're all coming in from a single connection.

The load balancer is lloadd, isn't it?

Thanks.

f.g.

--
Frédéric Goudal
Systems Engineer, DSI Bordeaux-INP
+33 556 84 23 11

chris.paul＠rexconsulting.net

30 Mar

8:36 a.m.

On 3/25/24 12:52 PM, Howard Chu hyc@symas.com wrote:

> Yes, they're common, and these are caused by a client sending too many operations over a connection without waiting for them to complete. In other words, a poorly written client.
>
> Simply adding more replicas does nothing to address this, you need a load balancer that spreads all client queries out, even when they're all coming in from a single connection.
>
> Better yet is to identify the client and fix it.

I won't disagree with Howard, who knows a lot more than I do about OpenLDAP.

But I do want to add that if you do have a load balancer, and you do see these, then check if your load balancer is using SNAT to manage client connections. Usually load balancers do use SNAT.

If you see these errors and you are using load balancers that SNAT client IPs, then adding replicas is a good fix.

Chris Paul | https://www.rexconsulting.net

BECOT Jérôme

4 Apr

1:13 a.m.

We don't have load balancers yet, but we have cross-site replicas that suffer from many clients reconnecting when one side is failing, and we are considering adding both replicas and a load balancer in front.

Thanks for the point.

openldap-technical@openldap.org
