Aaron Richton wrote:
Backend is hdb. The add was done under 2.3.37, the server has since been upgraded to 2.3.40 (last Thursday).
db_dump -p does show a high character...
\80\10cn=172.23.58.210\00cn=1\b72.23.58.210\00\00\00\00\00\00\01\e4e
which I suppose you're theorizing is off-by-one on the disk. BTW, slapcat doesn't agree with ldapsearch here, which might be construed as a bug? (Although it's tough to call something a bug in an impossible situation, I admit.)
I dunno. I suppose if the hardware is bad then I might be seeing all sorts of crazy stuff. But we're talking modern SPARC/Solaris hardware: all my other boxes are pretty good about screaming during their death. And they also don't have the amazingly coincidental upgrade to 2.3.40 from last Thursday, either. (I started seeing err=80 returns within 24 hours of that upgrade, after almost never seeing them in production.) It's all just a bit too perfect a storm for my taste...but I'd obviously defer to you when it comes to bdb internals.
Well, if the entry was created a while ago and hasn't ever been modrdn'd then this DB record would never have been touched since then. (Or more to the point, not modrdn'd since the 2.3.40 upgrade.) It's possible that a software bug might have caused a stray bitwise OR to set that bit in an in-memory cached page, but it seems unlikely that a page corrupted in that manner would have been written back to disk. Unless of course this record is one of the last ones in the DB before the upgrade, and new entries were added right after it. ("Last" is obviously imprecise. But the point is you would have had to have created some new entries after the upgrade whose DNs would cause their records to fall on the same DB page as this record, meaning they're close in the sort order and such.)
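For reference, the single-bit nature of the damage can be checked directly from the two copies of the RDN in the db_dump -p line quoted above. The following is only a rough sketch in Python: decode_dump is a hypothetical helper (not part of any BerkeleyDB or OpenLDAP tool), and it assumes the usual db_dump -p convention that non-printable bytes appear as \xx hex escapes and a literal backslash as \\.

```python
def decode_dump(text: str) -> bytes:
    """Decode db_dump -p style text into raw bytes.

    Assumption: "\\xx" is a two-digit hex escape and "\\\\" is a literal
    backslash; every other character stands for itself.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            if text[i + 1] == "\\":
                out.append(0x5C)          # escaped backslash
                i += 2
            else:
                out.append(int(text[i + 1:i + 3], 16))  # \xx hex escape
                i += 3
        else:
            out.append(ord(text[i]))      # printable character
            i += 1
    return bytes(out)


# The two copies of the RDN as they appear in the dumped record.
good = decode_dump(r"cn=172.23.58.210")
bad = decode_dump(r"cn=1\b72.23.58.210")

# Collect (offset, expected byte, dumped byte, XOR of the two).
diffs = [(i, g, b, g ^ b) for i, (g, b) in enumerate(zip(good, bad)) if g != b]

for offset, g, b, x in diffs:
    print(f"offset {offset}: {g:#04x} -> {b:#04x}, xor {x:#04x}")
# prints: offset 4: 0x37 -> 0xb7, xor 0x80
```

The only difference is at offset 4, where '7' (0x37) became 0xb7, i.e. the same byte with the high bit (0x80) set, which is what a stray OR with 0x80 would produce.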
With that in mind, I might as well see through your theory. I'll play with a second box tomorrow and see if I can reproduce fresh.
And for most err=80 situations, there ought to be BerkeleyDB error messages in the log as well.