Hello,
On all of our OpenLDAP 2.4 and 2.5 slaves of 2.4 servers, we see a lot of deferring errors:
slapd[37277]: connection_input: conn=32974 deferring operation: too many executing
or
slapd[37277]: connection_input: conn=32974 deferring operation: pending operations
Can you give any hints on how to debug what causes these messages? (Which log level should I aim for, and which system settings should I check?)
I can provide more details then.
Regards Jerome
--On Monday, March 25, 2024 3:49 PM +0000 BECOT Jérôme jbecot@itsgroup.com wrote:
Hello,
On all different OpenLDAP 2.4 and 2.5 slaves of 2.4 servers, we see a lot of deferring errors: slapd[37277]: connection_input: conn=32974 deferring operation: too many executing or slapd[37277]: connection_input: conn=32974 deferring operation: pending operations
Those aren't errors.
--Quanah
Those aren't errors.
But a deferral is not optimal, is it? I think the question "hints about way to debug" is probably a good one. The brute force method to fix this would be to add consumers and spread out the load. Horizontal scaling is the main benefit of a replicated architecture.
Chris Paul | https://www.rexconsulting.net
--On Monday, March 25, 2024 6:06 PM +0000 Christopher Paul chris.paul@rexconsulting.net wrote:
Those aren't errors.
But a deferral is not optimal, is it? I think the question "hints about way to debug" is probably a good one. The brute force method to fix this would be to add consumers and spread out the load. Horizontal scaling is the main benefit of a replicated architecture.
Deferrals are common and are not necessarily indicative of an issue. Without more detail, there's no way to determine whether there is an issue that needs to be addressed.
--Quanah
Quanah Gibson-Mount wrote:
--On Monday, March 25, 2024 6:06 PM +0000 Christopher Paul chris.paul@rexconsulting.net wrote:
Those aren't errors.
But a deferral is not optimal, is it? I think the question "hints about way to debug" is probably a good one. The brute force method to fix this would be to add consumers and spread out the load. Horizontal scaling is the main benefit of a replicated architecture.
slapd[37277]: connection_input: conn=32974 deferring operation: too many executing
Deferrals are common, they are not necessarily indicative of an issue, and without more detail there's no way to determine there is an issue that needs to be addressed or not.
Yes, they're common, and these are caused by a client sending too many operations over a connection without waiting for them to complete. In other words, a poorly written client.
Simply adding more replicas does nothing to address this; you need a load balancer that spreads all client queries out, even when they're all coming in from a single connection.
Better yet is to identify the client and fix it.
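One way to identify the client is to count deferrals per connection in the slapd log and map each connection back to its source address. The following is a minimal sketch, not an official tool. It assumes the log also contains the usual "conn=N fd=N ACCEPT from IP=addr:port" lines (typically emitted at the stats log level) and that connection numbers are not reused within the log being read. The function name deferrals_by_client is hypothetical.

```python
import re
from collections import Counter

# Assumed shape of an ACCEPT line (stats log level), e.g.
#   conn=32974 fd=12 ACCEPT from IP=10.0.0.5:40112 (IP=0.0.0.0:389)
ACCEPT_RE = re.compile(r"conn=(\d+) fd=\d+ ACCEPT from IP=(\S+):\d+")

# Shape of the deferral line as quoted in this thread.
DEFER_RE = re.compile(r"connection_input: conn=(\d+) deferring operation: (.+)$")


def deferrals_by_client(lines):
    """Count deferrals per (client address, reason) from slapd log lines."""
    conn_to_ip = {}
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        accept = ACCEPT_RE.search(line)
        if accept:
            # Remember which address opened this connection.
            conn_to_ip[accept.group(1)] = accept.group(2)
            continue
        defer = DEFER_RE.search(line)
        if defer:
            # Connections accepted before the log starts show up as "unknown".
            ip = conn_to_ip.get(defer.group(1), "unknown")
            counts[(ip, defer.group(2).strip())] += 1
    return counts
```

Passing an open log file to deferrals_by_client and calling most_common() on the result lists the addresses with the most deferrals first. If the clients sit behind a NAT or a proxy, the address will be that of the intermediary rather than the real client.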
Thank you for the help. We will look at the clients. I fear sssd would be the culprit, but we have to investigate first.
________________________________
From: Howard Chu hyc@symas.com
Sent: Monday, March 25, 2024 20:52
To: Quanah Gibson-Mount quanah@fast-mail.org; Christopher Paul chris.paul@rexconsulting.net; BECOT Jérôme jbecot@itsgroup.com; openldap-technical openldap-technical@openldap.org
Subject: Re: Help debugging slave slapd issues
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
BECOT Jérôme wrote:
Thank you for the help. We will look at the clients. I fear sssd would be the culprit, but we have to investigate first.
There's really nothing that needs to be done here. The deferred operations will eventually get processed.
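To illustrate why a deferral is a delay rather than a failure, here is a toy model of a single connection. It is only a sketch under stated assumptions: the per-connection limit (max_executing), the tick-based completion rate, and the FIFO queue are all hypothetical and do not reproduce slapd's actual internals. A client sends all its operations at once, those over the limit are deferred, and every deferred operation still completes.

```python
from collections import deque


def simulate(num_ops, max_executing, completions_per_tick=1):
    """Toy model: ops beyond max_executing are deferred, then run later."""
    if max_executing < 1 or completions_per_tick < 1:
        raise ValueError("limits must be at least 1")

    pending = deque()  # deferred operations, served in arrival order
    executing = 0
    deferred = 0
    completed = 0

    # The client sends everything without waiting for any reply.
    for op in range(num_ops):
        if executing < max_executing:
            executing += 1
        else:
            pending.append(op)
            deferred += 1

    ticks = 0
    while executing or pending:
        ticks += 1
        # Some executing operations finish during this tick.
        done = min(completions_per_tick, executing)
        executing -= done
        completed += done
        # Freed slots are handed to deferred operations.
        while pending and executing < max_executing:
            pending.popleft()
            executing += 1

    return {"deferred": deferred, "completed": completed, "ticks": ticks}
```

For example, simulate(10, 4) defers 6 of the 10 operations, yet all 10 complete. In this model the only cost of a deferral is added latency for the client that sent the burst.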
Hello,
On March 25, 2024, at 20:52, Howard Chu hyc@symas.com wrote:
Simply adding more replicas does nothing to address this, you need a load balancer that spreads all client queries out, even when they're all coming in from a single connection.
The load balancer is lloadd, isn't it?
Thanks.
f.g.
-- Frédéric Goudal, Systems Engineer, DSI Bordeaux-INP +33 556 84 23 11
On 3/25/24 12:52 PM, Howard Chu hyc@symas.com wrote:
Yes, they're common, and these are caused by a client sending too many operations over a connection without waiting for them to complete. In other words, a poorly written client.
Simply adding more replicas does nothing to address this, you need a load balancer that spreads all client queries out, even when they're all coming in from a single connection.
Better yet is to identify the client and fix it.
I won't disagree with Howard, who knows a lot more than I do about OpenLDAP.
But I do want to add that if you do have a load balancer, and you do see these, then check if your load balancer is using SNAT to manage client connections. Usually load balancers do use SNAT.
In the case you see these errors and you are using load balancers that SNAT client IPs, then adding replicas is a good fix.
Chris Paul | https://www.rexconsulting.net
We don't have load balancers yet, but we have cross-site replicas that suffer from many clients reconnecting when one side fails, and we are considering adding both replicas and a load balancer in front.
Thanks for the point.
________________________________
From: chris.paul@rexconsulting.net chris.paul@rexconsulting.net
Sent: Saturday, March 30, 2024 16:36
To: openldap-technical openldap-technical@openldap.org
Subject: Re: Help debugging slave slapd issues
openldap-technical@openldap.org