Full_Name: Aaron Richton
Version: 2.3.38
OS: Solaris 9
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (68.196.250.105)

I just noticed that my syslog files were growing faster than usual. On further inspection, two slaves each have multiple corrupt hdb databases. Both slave{4,6} have been running slapd continuously since September 4 and are still running. Both are running patched BDB 4.2.52 (the same binaries I've been using throughout the whole 2.3 series). All DB_CONFIGs have DB_LOG_AUTOREMOVE set. Messages like the ones below are logged at every checkpoint interval, which is why the logs are growing so quickly.

I'm inclined to just zap all the databases and start again (they're only slaves), but figured I'd post for tracking and to ask whether there's anything worth grabbing from the running process before I do so. Curiously, base4 is corrupt only on slave4, not on slave6. Other databases hosted on each slave appear unaffected.

The first indication of trouble:
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base1): DB_ENV->log_flush: LSN of 1/8730339 past current end-of-log of 1/188113
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base1): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base1): entryCSN.bdb: unable to flush page: 0
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base1): txn_checkpoint: failed to flush the buffer cache Invalid argument
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base2): DB_ENV->log_flush: LSN of 54/1636114 past current end-of-log of 4/2981780
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base2): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base2): entryUUID.bdb: unable to flush page: 0
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base2): txn_checkpoint: failed to flush the buffer cache Invalid argument
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base3): DB_ENV->log_flush: LSN of 1/600564 past current end-of-log of 1/662
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base3): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base3): cn.bdb: unable to flush page: 0
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base3): txn_checkpoint: failed to flush the buffer cache Invalid argument
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base4): DB_ENV->log_flush: LSN of 3/2765493 past current end-of-log of 1/539
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base4): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base4): uid.bdb: unable to flush page: 0
Sep 24 09:43:36 slave4.rutgers.edu slapd[295]: [ID 446079 local4.debug] bdb(base4): txn_checkpoint: failed to flush the buffer cache Invalid argument
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base1): DB_ENV->log_flush: LSN of 1/8730401 past current end-of-log of 1/188113
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base1): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base1): entryCSN.bdb: unable to flush page: 0
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base1): txn_checkpoint: failed to flush the buffer cache Invalid argument
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base2): DB_ENV->log_flush: LSN of 54/1634334 past current end-of-log of 4/1649467
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base2): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base2): entryUUID.bdb: unable to flush page: 0
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base2): txn_checkpoint: failed to flush the buffer cache Invalid argument
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base3): DB_ENV->log_flush: LSN of 1/600564 past current end-of-log of 1/538
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base3): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base3): cn.bdb: unable to flush page: 0
Sep 24 09:44:49 slave6.rutgers.edu slapd[301]: [ID 446079 local4.debug] bdb(base3): txn_checkpoint: failed to flush the buffer cache Invalid argument
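
Since these messages repeat at every checkpoint, it may help to reduce the syslog to one entry per host and database before comparing the slaves. The following is only a rough sketch, not part of the original report: it assumes one syslog entry per line in the format shown above, and the function name `summarize_lsn_errors` is made up for illustration.

```python
import re

# Matches the "LSN ... past current end-of-log" message as it appears in the
# excerpt above. The surrounding syslog layout is an assumption taken from
# that excerpt only.
LSN_RE = re.compile(
    r"(?P<host>\S+) slapd\[\d+\]: .*?bdb\((?P<db>[^)]+)\): "
    r"DB_ENV->log_flush: LSN of (?P<lsn_file>\d+)/(?P<lsn_off>\d+) "
    r"past current end-of-log of (?P<end_file>\d+)/(?P<end_off>\d+)"
)


def summarize_lsn_errors(text):
    """Map (host, database) to ((lsn_file, lsn_offset), (end_file, end_offset)).

    Only the first occurrence for each (host, database) pair is kept, so the
    repeated per-checkpoint messages collapse to one entry each.
    """
    seen = {}
    # Process line by line so a match can never span two syslog entries.
    for line in text.splitlines():
        m = LSN_RE.search(line)
        if m is None:
            continue
        key = (m.group("host"), m.group("db"))
        if key not in seen:
            requested = (int(m.group("lsn_file")), int(m.group("lsn_off")))
            end_of_log = (int(m.group("end_file")), int(m.group("end_off")))
            seen[key] = (requested, end_of_log)
    return seen
```

Calling `summarize_lsn_errors(open(path).read())` on a syslog file (where `path` is whatever file the local4 facility is written to) returns a dictionary that can be sorted or printed to see which databases are affected on which slave.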