https://bugs.openldap.org/show_bug.cgi?id=10138
Issue ID: 10138
Summary: Allow generating multiple nested read transactions from a write transaction
Product: LMDB
Version: 0.9.30
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: renault.cle@gmail.com
Target Milestone: ---
Hello,
I have a feature request: would it be possible to read a database from the point of view of a not-yet-committed write transaction?
What I want to do is write a lot of entries into a database, use a couple of threads (with MDB_NOTLS) to read those entries and generate a lot of new entries on disk, and then, once the generation is done, drop the read-transaction handles and write (with MDB_APPEND) those new entries from disk into LMDB.
This would have been possible if I had committed the first entries, but unfortunately it is not: I need to do all of this within the same transaction.
Have a great day, kero
Howard Chu hyc@openldap.org changed:
Resolution: --- → INVALID
Status: UNCONFIRMED → RESOLVED
--- Comment #1 from Howard Chu hyc@openldap.org --- All reads within a write txn see the not-yet-committed data, so this feature request is pointless.
--- Comment #2 from kero renault.cle@gmail.com --- Sorry, my explanation was not clear enough. What [the documentation says][1] is that having multiple transactions on the same thread is forbidden and that a transaction must only be used on a single thread.
> A transaction and its cursors must only be used by a single thread, and a thread may only have a single transaction at a time. If #MDB_NOTLS is in use, this does not apply to read-only transactions.
I want to create multiple read transactions on different threads to read the uncommitted changes in parallel. I will not modify the entries while reading them. I want to read in parallel, like I can on the committed changes but on the uncommitted changes.
Once all the read transactions are done, I close them and insert new entries into the database using the original, still uncommitted, write transaction. Then, commit it, and the changes will be visible from an external point of view.
I understand that the read transactions start with the committed root node and can't see the uncommitted new root. Is it an issue to use the uncommitted root as the baseline of the reads?
[1]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#L982-L...
--- Comment #3 from Howard Chu hyc@openldap.org --- If you guarantee that no writes are being performed while you're doing these reads, and you're not using any special feature (like encryption, which maintains a per-txn cache of decrypted pages), then you only need to open multiple cursors on the write txn and use one cursor per thread.
--- Comment #4 from Howard Chu hyc@openldap.org --- Note: the actual opening and closing of all of the cursors must only be done by one thread, as well.
--- Comment #5 from kero renault.cle@gmail.com --- Thank you very much for your answer.
I found a better and probably safer and easier workaround to implement. I store the pointer and length of the entries I want to fetch in a vector from the original write transaction (cursor).
Then, I reference this vector to all of my threads. The threads use this vector to get references to the entries' values directly without having a cursor to the database. I ensure no updates are performed on the entries to keep the pointers valid until the thread has finished.
I am pretty sure this optimization is valid, but not if I use any special feature (like encryption, which maintains per-txn cache of decrypted pages as you told me).
Do you see anything that could go wrong with this solution? Are there other features that can break it?
--- Comment #6 from Howard Chu hyc@openldap.org --- Sounds OK. As long as nothing changes any of the write txn's state, you should be fine.
--- Comment #7 from Howard Chu hyc@openldap.org --- Note that if you wrote a lot of data in the txn, some dirty pages may have been flushed to make room for newer writes. In that case, any pointers to those flushed pages would be invalid. In that case, you'd have to use cursors to read the data reliably.
--- Comment #8 from kero renault.cle@gmail.com ---
> As long as nothing changes any of the write txn's state, you should be fine.

Do you mean that it can be fine even if I use the encryption feature?

> Note that if you wrote a lot of data in the txn, some dirty pages may have been flushed to make room for newer writes. In that case, any pointers to those flushed pages would be invalid. In that case, you'd have to use cursors to read the data reliably.

Do you mean it is unreliable to keep the data pointers coming from a read-only cursor if I wrote a lot of data into it? Even if I collect those data pointers AFTER I have finished writing entries into the database?
Because, according to the documentation, my solution is valid in the sense that no updates are performed while I iterate on the database entries. Therefore, the threads are reading frozen pointers into the mmap area.

> Values returned from the database are valid only until a subsequent update operation, or the end of the transaction.
--- Comment #9 from Howard Chu hyc@openldap.org --- (In reply to kero from comment #8)
> > As long as nothing changes any of the write txn's state, you should be fine.
> Do you mean that it can be fine even if I use the encryption feature?

No. Accessing records will cycle pages through the cache, thus changing the write txn's state.

> > Note that if you wrote a lot of data in the txn, some dirty pages may have been flushed to make room for newer writes. In that case, any pointers to those flushed pages would be invalid. In that case, you'd have to use cursors to read the data reliably.
> Do you mean it is unreliable to keep the data pointers coming from a read-only cursor if I wrote a lot of data into it? Even if I collect those data pointers AFTER I have finished writing entries into the database?

Ah, if you used a cursor to enumerate the data pointers after all writes were done, then it'll be fine.

> Because, according to the documentation, my solution is valid in the sense that no updates are performed while I iterate on the database entries. Therefore the threads are reading frozen pointers to the mmap area.
> > Values returned from the database are valid only until a subsequent update operation, or the end of the transaction.
--- Comment #10 from kero renault.cle@gmail.com ---
> No. Accessing records will cycle pages through the cache, thus changing the write txn's state.

Can you point me to the documentation that explains this, please? [The mdb_get documentation][1] is the same as on the mdb.master branch. Is it perhaps documented elsewhere?
[1]: https://github.com/LMDB/lmdb/blob/22a41169c1f7a2c6a58c2fb10a4ed55bd8fe0d77/l...
Anyway, thank you very much for your help.
Quanah Gibson-Mount quanah@openldap.org changed:
Keywords: needs_review → (removed)
Status: RESOLVED → VERIFIED
--- Comment #11 from kero renault.cle@gmail.com --- Hey Howard,
I would like to know whether I can use the same trick as previously stated with the `mdb.master3` branch's checksum feature, or whether pages will cycle through the cache there, too.
Also, is there a release page, blog post, or anything else explaining how the checksum feature works, and what it is capable of catching? Does it only checksum pages when reading them, or can it also check all of a txn's new pages just before committing, to be sure that the data is correct?