----- richton@nbcs.rutgers.edu wrote:
I don't have #5 (sleepycat#14657) nor the unofficial http://www.stanford.edu/services/directory/openldap/configuration/patches/db...
patch. As for the official one, I'm not sure about its relevance to the actual SEGV due to the "recovery...fail" comment. In other words, though it may be impacting the ability of alock/db_recover to do its thing, that's just a side effect of the unclean shutdown, which is the real bug here in my view.
Patch #5 specifically deals with a race condition in which a checkpoint occurs while a cache buffer retrieval is also occurring, causing database corruption that cannot be recovered from later. At least, that's how I read Sleepycat's description:
5. Fix a bug where cache buffer retrieval could race with a checkpoint call, potentially causing database environment recovery to fail. [#14657]
Given that OpenLDAP checkpoints on shutdown, shutting down the server could be what is triggering the issue for you. I'd suggest applying the patch and seeing if this resolves your problem.
The region size patch is interesting, but I will tell you that the database in question has
set_cachesize 0 200000000 0
and it (at a glance) looks like that only impacts the gig column, which I have as zero anyway.
Yeah, the patch may not apply for you (I have a 3.5GB cache, so it does for me). Wouldn't harm anything, of course, if you decided later you needed a larger BDB cache. ;)
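To make the set_cachesize fields concrete, here is a minimal Python sketch that splits a DB_CONFIG line into its three values (gigabytes, additional bytes, number of cache regions) and totals the cache size. The helper names are made up for illustration, and the 3.5GB line is a hypothetical example, not my actual config. Whether the region size patch matters only when the gigabytes field is nonzero is the reading from above, not something this code verifies.

```python
GIGABYTE = 1024 ** 3  # BDB counts a gigabyte as 2**30 bytes


def parse_set_cachesize(line):
    """Split a DB_CONFIG 'set_cachesize <gbytes> <bytes> <ncache>' line.

    Hypothetical helper for illustration; not part of BDB or OpenLDAP.
    """
    fields = line.split()
    if len(fields) != 4 or fields[0] != "set_cachesize":
        raise ValueError("not a set_cachesize line: %r" % line)
    gbytes, nbytes, ncache = (int(f) for f in fields[1:])
    return gbytes, nbytes, ncache


def total_cache_bytes(gbytes, nbytes):
    """Total cache size: gbytes gigabytes plus nbytes bytes."""
    return gbytes * GIGABYTE + nbytes


# The line quoted in this thread: gigabytes field is zero.
small = parse_set_cachesize("set_cachesize 0 200000000 0")
print(small, total_cache_bytes(small[0], small[1]))

# Hypothetical 3.5GB cache: 3 gigabytes plus half a gigabyte in bytes.
large = parse_set_cachesize("set_cachesize 3 536870912 1")
print(large, total_cache_bytes(large[0], large[1]))
```

With the first line the gigabytes field is 0, so the cache is just the 200000000 bytes; with the second, the gigabytes field is nonzero, which is the case the region size patch is said to concern.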
--Quanah