Re: slapd crashing "randomly?"

12 Feb 2007


      On Feb 6, 2007, at 3:34 PM, Quanah Gibson-Mount wrote:
...
--On Tuesday, February 06, 2007 1:35 PM -0500 matthew sporleder msporleder@gmail.com wrote:
...
On 2/6/07, daniel@ncsu.edu daniel@ncsu.edu wrote:
...
Hi folk,
I want to start this message by saying, what I'm about to describe is
completely vague and I don't expect to get a solution response.  ;)
Basically, I'm out of ideas and am looking for some suggestions as to how
to debug the issue I'm running into.
Starting about half a year ago, slapd started "just dying" out of the
blue.  Not a thing shows up in the logs to indicate what might have
caused it.  The last query that I see in the logs before a crash always
seems to be nothing special.  I don't even see a core dump being
generated yet, but that may just be because I don't have the proper
setup to get a core dump at this time.  We were running the last 2.2 and
upgraded to the latest release of 2.3 to make sure it wasn't an "old
version" issue.  Unfortunately, slapd still dies a fair amount on us.  It
appears to be fairly unpredictable.  I've seen it crash within 1 minute
of starting up slapd (then a subsequent startup 'takes' just fine).
I've seen it crash when there were a number of network issues going on.
I've seen it crash out of the blue when nothing appeared to be going on.
I don't really have the drive space to turn on max debug logging 24/7
until the problem occurs.
We're thinking about setting up something to watch all of the network
traffic going to one of the boxes until it dies (assuming we can find
something with the resources to do that).
That all said...  since I have nothing solid to present, do you all have
any suggestions of what would be the best way to track down what's going
on?  I'm literally out of ideas unless my Berkeley DB config is somehow
causing the problem or something like that.
I apologize for the vagueness.  =/  Any ideas/suggestions?
After the crash, is your bdb environment clean, or is it needing a
db_recover?
Depending on your OS, you could watch the pid all the time and trap
the last signals received, last files accessed, etc, and that wouldn't
take tons of resources.
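A cheap first step toward "watching the pid" can be sketched as below.
This is only an illustration, assuming a POSIX system; the function names
pid_alive and wait_for_exit are made up for this example. It records when
the process disappears, nothing more. Capturing the last signals or file
accesses would need an OS-level tracer, which is not shown here.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this pid currently exists (POSIX)."""
    try:
        # Signal 0 only performs the existence/permission check;
        # no signal is actually delivered to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def wait_for_exit(pid, interval=1.0):
    """Poll until the pid is gone; return the time the exit was noticed."""
    while pid_alive(pid):
        time.sleep(interval)
    return time.time()
```

Calling wait_for_exit with the pid from slapd's pid file and logging the
returned timestamp would at least pin the crash time down to within one
polling interval, which helps when lining it up with other logs.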
You could try turning on max debugging and simply rotate a lot more
often.  (every n minutes or even seconds)  This way you could
definitely keep the -last- transactions and just not worry about the
old ones.
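The "keep only the last transactions" idea can also be approximated
without rotating files at all, by holding the newest lines in a fixed-size
in-memory buffer and writing them out once the input ends. The sketch
below swaps that ring-buffer technique in for rotation; it assumes the
debug output can be piped to the script, and the names keep_last and
dump_tail and the file name are hypothetical.

```python
from collections import deque


def keep_last(lines, max_lines):
    """Return only the newest max_lines entries from an iterable of lines."""
    # A deque with maxlen silently drops the oldest item once it is full,
    # so memory use stays bounded however long the input runs.
    return list(deque(lines, maxlen=max_lines))


def dump_tail(stream, out_path, max_lines):
    """Read stream until EOF, then write the last max_lines lines to out_path."""
    tail = keep_last(stream, max_lines)
    with open(out_path, "w") as fh:
        fh.writelines(tail)
    return len(tail)
```

For example, dump_tail(sys.stdin, "slapd-last.log", 10000) would block
until the pipe closes (that is, until the process feeding it exits) and
then leave the final 10000 lines on disk. Disk usage stays small, at the
cost of losing everything if the watcher itself is killed first.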
Also, what database backend are you using?  Why not build slapd with
debugging symbols so you can get a core?
BDB
and I am planning on doing so  ;D
...
What version of 2.3 are you running at the moment?  You say you had
upgraded to the latest release at some point, but not what release
that was.  Up until around 2.3.28, there were issues in the
connection code that caused random crashes on my servers.  2.3.33
would be your best bet to eliminate that as an issue if you aren't
there yet.
2.3.32 is what we're running right now.  I've been sticking with the
version that's labelled as "stable".  Do y'all recommend going with
the latest release instead of the "stable" one?
I've been having this issue since at least 2.2.whatever, so it's been
going on for quite some time version-wise.  Timewise, I still think
something may have changed in my world to cause all of this, but I just
can't track it down.
Anyway, I'm working on setting up some things with which I can track it.
Thanks!
Daniel
...
--Quanah
--
Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
