<quote who="Toby Blake">
Hi all,
Hi Toby.
For largely historical reasons we run slapd servers on most clients (this will probably change in the future - I'm just giving this information as background).
Why?
We're seeing problems when some of these machines are busy, particularly, it seems, with memory intensive activity, although it's hard to substantiate as I generally only see the machines after they've broken. It's annoying as I can't reproduce these problems.
It's going to be hard to pinpoint then ;-) How much memory/CPU etc. do these clients have and what other services do they provide?
We see quite a few problems with slapd getting into a state where it's deferring operations, for whatever reason - I think I understand these - these are when slapd basically says sorry, I'm too busy doing X, so I'll defer Y until I have time. Is this accurate?
Yes. What kind of clients are searching/binding to them? Local?
The second case I'm also seeing is bdb complaining about locks being no longer valid, e.g.
slapd[3780]: bdb(dc=inf,dc=ed,dc=ac,dc=uk): DB_LOCK->lock_put: Lock is no longer valid
slapd seems to keep going for the time being until getting into a state where it defers all binding operations and goes into some kind of spin where it sits at 99% cpu and has to be killed with a -9.
Is everything local? Nothing network-mounted, like NFS, for the directory data?
I suppose I have a couple of questions about the "Lock is no longer valid" error....
- What causes it?
- Is it something I can prevent by configuration changes (for instance, would increasing the numbers of locks, lockers and objects help?)
One for the dev team. I do know this is an error message from Berkeley DB by grepping the source.
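For reference, the lock, locker and object limits asked about above are set in Berkeley DB's DB_CONFIG file in the database directory. The fragment below is only a sketch: the values are illustrative assumptions, not recommendations, and nothing in this thread confirms that raising them cures the "Lock is no longer valid" error.

```
# DB_CONFIG - illustrative values only (assumed, not tuned for this site)
set_lk_max_locks   3000
set_lk_max_lockers 1500
set_lk_max_objects 1500
```

Running db_stat -c with -h pointing at the database directory reports the current and maximum number of locks, lockers and lock objects in use, which should show whether the existing limits are actually being reached before anything is changed.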
We're running openldap 2.3.35 with ITS#4924 and ITS#4925 patches with a bdb backend running 4.2.52 with all 6 recommended patches.
I hope you mean 5, as there are only 5 listed on the Oracle site.
The only DBCONFIG settings we currently have are:
dbconfig set_cachesize 0 67108864 1
dbconfig set_lg_regionmax 262144
dbconfig set_lg_bsize 2097152
I take it dbconfig is a keyword you've added for this example, as it's not valid.
Thanks in advance
Toby Blake
School of Informatics
University of Edinburgh