https://bugs.openldap.org/show_bug.cgi?id=10024
Issue ID: 10024
Summary: MDB_PREVSNAPSHOT broken
Product: LMDB
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: markus@objectbox.io
Target Milestone: ---
It seems that the patch for #9496 had a negative side effect on MDB_PREVSNAPSHOT. In certain cases, when opening the DB using MDB_PREVSNAPSHOT, the previous (second-latest) commit is not selected. Instead, reads show that the latest commit was selected, voiding the effect of MDB_PREVSNAPSHOT.
I observed this in our test cases a while back. Today, I was finally able to reproduce it and debug into it.
When creating the transaction to read the data, I debugged into mdb_txn_renew0. Here, ti (MDB_txninfo; env->me_txns) was non-NULL. However, ti->mti_txnid was 0 (!) and thus txn->mt_txnid was set to 0. That's the reason for always selecting the first (index 0) meta page inside mdb_txn_renew0:
meta = env->me_metas[txn->mt_txnid & 1];
This line occurs twice (once for read txn and once for write txn; it affects both txn types).
Thus, the chance of MDB_PREVSNAPSHOT selecting the correct meta page is 50-50. It's only correct if the first meta page (index 0) is the older one.
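To illustrate the 50-50 outcome, here is a minimal, self-contained sketch of the parity-based lookup. It is not LMDB code: the `toy_meta` type and the helpers `pick_by_parity`, `older_index` and `selects_previous` are hypothetical names made up for this illustration, and it assumes only what is described above, namely that the meta page is chosen by `txnid & 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a meta page; only the txn ID matters here. */
typedef struct {
    unsigned long txnid;
} toy_meta;

/* Mirrors the parity-based lookup: me_metas[txn->mt_txnid & 1]. */
size_t pick_by_parity(unsigned long txnid)
{
    return (size_t)(txnid & 1);
}

/* Index of the meta page with the smaller txn ID (the previous snapshot). */
size_t older_index(const toy_meta metas[2])
{
    assert(metas != NULL);
    return metas[0].txnid < metas[1].txnid ? 0 : 1;
}

/* Returns 1 if a lookup with the given txn ID lands on the previous snapshot. */
int selects_previous(const toy_meta metas[2], unsigned long txnid)
{
    return pick_by_parity(txnid) == older_index(metas);
}
```

With a txn ID of 0, `selects_previous` returns 1 only when the older meta page happens to sit at index 0 (for example txn IDs {4, 5}) and 0 otherwise (for example {6, 5}). Passing the older page's own txn ID selects it in both layouts.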
I believe that this is related to #9496 because the patch provided there removed the initialization of "env->me_txns->mti_txnid" in mdb_env_open2. This would explain why txn->mt_txnid inside mdb_txn_renew0 was set to 0.
I can confirm that adding the following two lines back in fixes MDB_PREVSNAPSHOT:
if (env->me_txns)
    env->me_txns->mti_txnid = meta.mm_txnid;
That patch, including the removal of these two lines, was applied in the commit(s) "ITS#9496 fix mdb_env_open bug from #8704" (Howard Chu on 09.04.21).
I hope this information is useful for finding a suitable fix. Please let me know if you have questions. Also, I'd be happy to help confirm a potential fix with our test suite.