I've had a spurt of bad luck with 2.4.16 (it appears Quanah and a few others may share that opinion). The seg faults inspired me to run under libumem, which has some interesting features that give you "moderate" debug ability in exchange for moderate performance hit -- small enough that I can run it hot safely, unlike full-featured memory debuggers.
At this point a RE24 checkout from late Saturday has been good for me in production, with some moderate libumem checks enabled. Is everybody else starting to see RE24 shape up? Bottom line...I think I'm now +1 for encouraging a 2.4.17 train, for what it's worth...
--On Monday, May 11, 2009 9:06 AM -0400 Aaron Richton richton@nbcs.rutgers.edu wrote:
> I've had a spurt of bad luck with 2.4.16 (it appears Quanah and a few others may share that opinion). The seg faults inspired me to run under libumem, which has some interesting features that give you "moderate" debug ability in exchange for moderate performance hit -- small enough that I can run it hot safely, unlike full-featured memory debuggers.
> At this point a RE24 checkout from late Saturday has been good for me in production, with some moderate libumem checks enabled. Is everybody else starting to see RE24 shape up? Bottom line...I think I'm now +1 for encouraging a 2.4.17 train, for what it's worth...
Overall, RE24 looks a lot better than 2.4.16, yes. There's still a nasty BDB deadlock that's cropped up for two of us now, that really needs figuring out.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
On Mon, 11 May 2009, Quanah Gibson-Mount wrote:
> Overall, RE24 looks a lot better than 2.4.16, yes. There's still a nasty BDB deadlock that's cropped up for two of us now, that really needs figuring out.
Hmm. Given a clean shutdown:
slapd[2085]: [ID 707592 local4.debug] slapd shutdown: waiting for 0 operations/tasks to finish
that looks like it completed fine:
# ps -ef | grep 2085
    root 15268 1261 0 15:14:17 pts/1 0:00 grep 2085
is it expected that db_stat would still show:
# /usr/local/db4/bin/sparcv9/db_stat -CA | grep pid | cut -d/ -f2 | cut -d\  -f2 | sort | uniq -c
  32 2085
notice pid 2085 matches? I'd expect that to be gone--is that wrong?
I just got that now with RE24.
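The check above can be scripted. This is only a minimal sketch: it assumes the locker lines in db_stat -CA output carry a "pid/thread" field (which is what the cut pipeline above relies on) and that it runs as root so that kill -0 can probe any pid. The function name stale_locker_pids is made up for illustration.

```shell
# Read `db_stat -CA` output on stdin and print every locker pid that no
# longer belongs to a live process. Hypothetical helper, not part of BDB.
stale_locker_pids() {
    # Same extraction as the pipeline in the message above
    # (cut -d' ' is equivalent to cut -d\ followed by a second space).
    grep pid | cut -d/ -f2 | cut -d' ' -f2 | sort -u |
    while read -r pid; do
        case "$pid" in
            ''|*[!0-9]*) continue ;;   # skip anything that is not a plain number
        esac
        # kill -0 sends no signal; it only tests whether the pid exists.
        # It needs root (or the target's own user) to answer reliably.
        kill -0 "$pid" 2>/dev/null || echo "$pid"
    done
}
```

Usage would be along the lines of `db_stat -CA | stale_locker_pids`, run from the database directory after slapd has exited; any pid it prints is a locker left behind by a process that is gone.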
Aaron Richton wrote:
> On Mon, 11 May 2009, Quanah Gibson-Mount wrote:
>> Overall, RE24 looks a lot better than 2.4.16, yes. There's still a nasty BDB deadlock that's cropped up for two of us now, that really needs figuring out.
> Hmm. Given a clean shutdown:
> slapd[2085]: [ID 707592 local4.debug] slapd shutdown: waiting for 0 operations/tasks to finish
> that looks like it completed fine:
> # ps -ef | grep 2085
>     root 15268 1261 0 15:14:17 pts/1 0:00 grep 2085
> is it expected that db_stat would still show:
> # /usr/local/db4/bin/sparcv9/db_stat -CA | grep pid | cut -d/ -f2 | cut -d\  -f2 | sort | uniq -c
>   32 2085
> notice pid 2085 matches? I'd expect that to be gone--is that wrong?
It should be gone. Show the full db_stat -CA output.
https://www.nbcs.rutgers.edu/~richton/dbstat-leftoverpid.txt
I was very careful to manually db_recover prior to start. "BEFORE" was while slapd was running. I then:
# pgrep slapd
15889
# pkill -INT slapd
[watched for]
slapd[15889]: [ID 707592 local4.debug] slapd shutdown: waiting for 0 operations/tasks to finish
[then the process went away, although I'm more than a bit confused by no "slapd stopped" -- does that not syslog?]
and then I generated the "AFTER" run.
Aaron Richton wrote:
> https://www.nbcs.rutgers.edu/~richton/dbstat-leftoverpid.txt
> I was very careful to manually db_recover prior to start. "BEFORE" was while slapd was running. I then:
> # pgrep slapd
> 15889
> # pkill -INT slapd
> [watched for]
> slapd[15889]: [ID 707592 local4.debug] slapd shutdown: waiting for 0 operations/tasks to finish
> [then the process went away, although I'm more than a bit confused by no "slapd stopped" -- does that not syslog?]
> and then I generated the "AFTER" run.
That looks completely wrong; there should be 0 current lockers after slapd exits. Run under gdb and see what happens at shutdown time.
Quanah Gibson-Mount wrote:
> --On Monday, May 11, 2009 9:06 AM -0400 Aaron Richton richton@nbcs.rutgers.edu wrote:
>> I've had a spurt of bad luck with 2.4.16 (it appears Quanah and a few others may share that opinion). The seg faults inspired me to run under libumem, which has some interesting features that give you "moderate" debug ability in exchange for moderate performance hit -- small enough that I can run it hot safely, unlike full-featured memory debuggers.
>> At this point a RE24 checkout from late Saturday has been good for me in production, with some moderate libumem checks enabled. Is everybody else starting to see RE24 shape up? Bottom line...I think I'm now +1 for encouraging a 2.4.17 train, for what it's worth...
> Overall, RE24 looks a lot better than 2.4.16, yes.
I'm still seeing make test fail occasionally in various tests (e.g. see ITS#6126).
Ciao, Michael.