(LMDB) read-only transactions and DBI handles


Viacheslav Usov

10 Dec 2015

5:50 p.m.

I have a question about mdb_dbi_open and the handles it returns. The official documentation suggests to me that mdb_dbi_open is supposed to be called early on and then the obtained handle should just be reused by other transactions without any further calls to mdb_dbi_open and mdb_dbi_close. The code in sample-mdb.txt does that (ignoring the final close-all sequence).

The sample also uses mdb_txn_abort to end a read-only transaction, which I take as the recommended way to end read-only transactions. The docs state that if a transaction aborts, then the associated DBI handles are closed, and so mdb_dbi_open must be used again to obtain a valid handle. As far as I can tell, that requires process-wide serialization of every transaction that calls mdb_dbi_open. This feels contrary to the style described above and, more importantly, simply wrong for read-only transactions, because the need to serialize them would negate the benefits of LMDB's lock-free paradigm.

Which probably means I misunderstand something. Can read-only transactions keep DBI handles reusable? How?

Thanks, V.


Howard Chu

11 Dec 2015

3:37 a.m.

Viacheslav Usov wrote:

> I have a question about mdb_dbi_open and the handles it returns. The official documentation suggests to me that mdb_dbi_open is supposed to be called early on and then the obtained handle should just be reused by other transactions without any further calls to mdb_dbi_open and mdb_dbi_close. The code in sample-mdb.txt does that (ignoring the final close-all sequence).
>
> The sample also uses mdb_txn_abort to end a read-only transaction, which I take as the recommended way to end read-only transactions.

Transactional databases are very simple:

abort is the way to end *any* transaction whose operations you wish to discard.

commit is the way to end *any* transaction whose operations you wish to persist.

If you open a DBI in a read-only transaction and you want that DBI to persist after ending the transaction, then commit the transaction.

--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/

Viacheslav Usov

12:38 p.m.

On Fri, Dec 11, 2015 at 3:37 AM, Howard Chu hyc@symas.com wrote:

> abort is the way to end *any* transaction whose operations you wish to discard.
>
> commit is the way to end *any* transaction whose operations you wish to persist.

If one is to interpret DBI handles as entities that can be created, persisted and discarded by transactions, then this interpretation is difficult to reconcile with the fact that this can be done by a read-only transaction.

This interpretation is even more problematic if one considers that a transaction, including a read-only transaction, can discard entities it never created to begin with.

At a more fundamental level, LMDB is said to support "multiple sub-databases [...] with transactions covering all sub-databases" [1]. However, LMDB's API with respect to sub-databases and transactions is not orthogonal [2]. With an orthogonal API, a DBI handle would remain valid as long as the underlying sub-database existed, regardless of commits and aborts in transactions using it.

Now, I am not saying that LMDB must be orthogonal in that sense. This is definitely a design choice, and there may be good reasons why it is impossible or difficult for LMDB to behave that way. What I am saying, however, is that such a mode would be very nice to have, if it is not too difficult to implement. It could be requested by a flag passed to, preferably, mdb_dbi_open or to mdb_txn_begin.

> If you open a DBI in a read-only transaction and you want that DBI to persist after ending the transaction, then commit the transaction.

Can you confirm that, with read-only transactions, the only difference between commit and abort is in the way the associated DBI handles are treated, and, specifically, that no observable effect on performance will occur? If that is so, a note in the documentation stating this would be very helpful.

Cheers, V.

[1] http://symas.com/mdb/

[2] http://www.catb.org/jargon/html/O/orthogonal.html

Christian Sell

2:19 p.m.

Hi,

On 11 December 2015 at 12:38, Viacheslav Usov via.usov@gmail.com wrote:

> However, LMDB's API with respect to sub-databases and transactions is not orthogonal [2]. With an orthogonal API, a DBI handle would remain valid as long as the underlying sub-database existed, regardless of commits and aborts in transactions using it.

as I understand it (and am practicing it), the dbi is created by one transaction and remains valid after that transaction commits until it is either explicitly closed or the environment dies. What *is* somewhat awkward is the fact that it can be a read transaction that creates it. In my application, I am creating a dbi handle for each (sub-)database using a short transaction for just that purpose and then save it somewhere.

Chris

Howard Chu

2:53 p.m.

Christian Sell wrote:

> Hi,
>
> On 11 December 2015 at 12:38, Viacheslav Usov via.usov@gmail.com wrote:
>
>> However, LMDB's API with respect to sub-databases and transactions is not orthogonal [2]. With an orthogonal API, a DBI handle would remain valid as long as the underlying sub-database existed, regardless of commits and aborts in transactions using it.
>
> as I understand it (and am practicing it), the dbi is created by one transaction and remains valid after that transaction commits until it is either explicitly closed or the environment dies. What *is* somewhat awkward is the fact that it can be a read transaction that creates it. In my application, I am creating a dbi handle for each (sub-)database using a short transaction for just that purpose and then save it somewhere.

There's nothing awkward here; it is essential. If you are creating a Sub-DB then you must use a write transaction, since you are actually altering the underlying DB environment. If you are simply accessing an existing Sub-DB then there is no reason to require a write transaction.


Viacheslav Usov

3:46 p.m.

On Fri, Dec 11, 2015 at 2:19 PM, Christian Sell christian@gsvitec.com wrote:

> as I understand it (and am practicing it), the dbi is created by one transaction and remains valid after that transaction commits until it is either explicitly closed or the environment dies.

If that is indeed the case, then my previous messages were based on a misunderstanding. I hope Howard can confirm this. Unfortunately, the documentation is confusing with respect to this point.

Cheers, V.

Christian Sell

5:27 p.m.

>> as I understand it (and am practicing it), the dbi is created by one transaction and remains valid after that transaction commits until it is either explicitly closed or the environment dies.
>
> If that is indeed the case, then my previous messages were based on a misunderstanding. I hope Howard can confirm this. Unfortunately, the documentation is confusing with respect to this point.

There is indeed documentation on this very matter. Here is an excerpt from the documentation of the mdb_dbi_open function:

A database handle denotes the name and parameters of a database, independently of whether such a database exists. The database handle may be discarded by calling mdb_dbi_close(). The old database handle is returned if the database was already open. The handle may only be closed once.

The database handle will be private to the current transaction until the transaction is successfully committed. If the transaction is aborted the handle will be closed automatically. After a successful commit the handle will reside in the shared environment, and may be used by other transactions.

This function must not be called from multiple concurrent transactions in the same process. A transaction that uses this function must finish (either commit or abort) before any other transaction in the process may use this function.

Chris
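
The lifecycle described in the mdb_dbi_open excerpt above can be modelled in a few lines of C. This is a toy sketch under stated assumptions, not LMDB's implementation or API: every name in it (toy_env, toy_txn, toy_dbi_open and so on) is hypothetical, and it mirrors only the documented rules that a handle is private to the opening transaction until commit, is closed if that transaction aborts, and is shared afterwards. It also assumes, as the documentation requires, that only one transaction at a time opens handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_DBS 8

/* Toy model only: these are NOT LMDB's data structures. */
typedef enum { SLOT_FREE = 0, SLOT_PRIVATE, SLOT_SHARED } slot_state;

typedef struct {
    const char *name;   /* sub-database name the slot refers to */
    slot_state  state;  /* free, private to one txn, or shared */
    int         owner;  /* id of the opening txn while SLOT_PRIVATE */
} toy_slot;

typedef struct {
    toy_slot slots[TOY_MAX_DBS];
    int      next_txn_id;
} toy_env;

typedef struct {
    toy_env *env;
    int      id;
} toy_txn;

static toy_txn toy_txn_begin(toy_env *env)
{
    toy_txn txn = { env, ++env->next_txn_id };
    return txn;
}

/* Return the existing slot if the name is already open; otherwise claim the
 * first free slot as private to this transaction. Returns -1 if the table
 * is full. */
static int toy_dbi_open(toy_txn *txn, const char *name)
{
    assert(name != NULL);
    int free_slot = -1;
    for (int i = 0; i < TOY_MAX_DBS; i++) {
        toy_slot *s = &txn->env->slots[i];
        if (s->state == SLOT_FREE) {
            if (free_slot < 0)
                free_slot = i;
        } else if (strcmp(s->name, name) == 0) {
            return i;
        }
    }
    if (free_slot >= 0) {
        toy_slot *s = &txn->env->slots[free_slot];
        s->name  = name;
        s->state = SLOT_PRIVATE;
        s->owner = txn->id;
    }
    return free_slot;
}

/* Ending a transaction touches only the slots that this transaction opened:
 * commit promotes them to shared, abort frees them. Shared slots opened by
 * earlier transactions are never modified here. */
static void toy_txn_end(toy_txn *txn, bool commit)
{
    for (int i = 0; i < TOY_MAX_DBS; i++) {
        toy_slot *s = &txn->env->slots[i];
        if (s->state == SLOT_PRIVATE && s->owner == txn->id)
            s->state = commit ? SLOT_SHARED : SLOT_FREE;
    }
}

static void toy_txn_commit(toy_txn *txn) { toy_txn_end(txn, true); }
static void toy_txn_abort(toy_txn *txn)  { toy_txn_end(txn, false); }

/* A handle is usable if it is shared, or private to the asking txn. */
static bool toy_dbi_usable(const toy_txn *txn, int dbi)
{
    if (dbi < 0 || dbi >= TOY_MAX_DBS)
        return false;
    const toy_slot *s = &txn->env->slots[dbi];
    return s->state == SLOT_SHARED ||
           (s->state == SLOT_PRIVATE && s->owner == txn->id);
}
```

In this model, the sample-mdb.txt pattern (open the handle in a first transaction that commits, then use it in a second transaction that aborts) leaves the slot in SLOT_SHARED, because the second transaction did not open it. Only a handle opened inside the aborted transaction itself goes back to SLOT_FREE.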

Howard Chu

2:51 p.m.

Viacheslav Usov wrote:

> On Fri, Dec 11, 2015 at 3:37 AM, Howard Chu <hyc@symas.com> wrote:
>
>> abort is the way to end *any* transaction whose operations you wish to discard.
>>
>> commit is the way to end *any* transaction whose operations you wish to persist.
>
> If one is to interpret DBI handles as entities that can be created, persisted and discarded by transactions, then this interpretation is difficult to reconcile with the fact that this can be done by a read-only transaction.

A DBI handle is simply a slot in an in-memory array. Read-only transactions are read-only with respect to the database.

> This interpretation is even more problematic if one considers that a transaction, including a read-only transaction, can discard entities it never created to begin with.

No, no transaction can discard something that it didn't create.

> At a more fundamental level, LMDB is said to support "multiple sub-databases [...] with transactions covering all sub-databases" [1]. However, LMDB's API with respect to sub-databases and transactions is not orthogonal [2]. With an orthogonal API, a DBI handle would remain valid as long as the underlying sub-database existed, regardless of commits and aborts in transactions using it.

No. Aborting a transaction essentially means that no operation that was performed inside the transaction actually took place. Sub-DBs exist within the database regardless of whether any process has a DBI handle open on them.

Your statement is preposterous. It is the same as saying that file handles in an operating system must stay open as long as files exist in a filesystem. Utter nonsense.

> Now, I am not saying that LMDB must be orthogonal in that sense. This is definitely a design choice, and there may be good reasons why it is impossible or difficult for LMDB to behave that way. What I am saying, however, is that such a mode would be very nice to have, if it is not too difficult to implement. It could be requested by a flag passed to, preferably, mdb_dbi_open or to mdb_txn_begin.

Nonsense.


Viacheslav Usov

3:41 p.m.

On Fri, Dec 11, 2015 at 2:51 PM, Howard Chu hyc@symas.com wrote:

> A DBI handle is simply a slot in an in-memory array. Read-only transactions are read-only with respect to the database.

That is fine with me. It was you who tried to explain the DBI handles in terms of "transactional databases" [1]. That explanation was problematic; copying your manner of speaking, one could say it was "utter nonsense".

Can you clarify just this point: in sample-mdb.txt, a DBI handle is opened in the first transaction, which commits, and is used in the second transaction, which aborts. What happens with the DBI handle when the second transaction aborts? Does it stay valid or does it get closed?

> Your statement is preposterous. It is the same as saying that file handles in an operating system must stay open as long as files exist in a filesystem. Utter nonsense.

Oh, you mean I did not specify every little detail, such as "a DBI handle, once successfully created and persisted and not explicitly closed, ...". Sorry about that.

> Nonsense.

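I appreciate the insightful explanation. There was another question in my message that you did not answer. I am copying it here for convenience.

> Can you confirm that, with read-only transactions, the only difference between commit and abort is in the way the associated DBI handles are treated, and, specifically, that no observable effect on performance will occur? If that is so, a note in the documentation stating this would be very helpful.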

Cheers, V.

[1] Howard Chu: "Transactional databases are very simple"

Howard Chu

4:24 p.m.

Viacheslav Usov wrote:

> On Fri, Dec 11, 2015 at 2:51 PM, Howard Chu <hyc@symas.com> wrote:
>
>> A DBI handle is simply a slot in an in-memory array. Read-only transactions are read-only with respect to the database.
>
> That is fine with me. It was you who tried to explain the DBI handles in terms of "transactional databases" [1]. That explanation was problematic; copying your manner of speaking, one could say it was "utter nonsense".
>
> Can you clarify just this point: in sample-mdb.txt, a DBI handle is opened in the first transaction, which commits, and is used in the second transaction, which aborts. What happens with the DBI handle when the second transaction aborts? Does it stay valid or does it get closed?

What happens when you write a record to the DB in the first transaction, which commits? When you start a second transaction, which aborts, does that record stay, or does it disappear?

What does the word "persist" mean?

Aborting a transaction only discards the operations that occurred within that transaction.

> There was another question in my message that you did not answer. I am copying it here for convenience.

And I am ignoring it, again.

> Can you confirm that, with read-only transactions, the only difference between commit and abort is in the way the associated DBI handles are treated, and, specifically, that no observable effect on performance will occur? If that is so, a note in the documentation stating this would be very helpful.
>
> Cheers, V.
>
> [1] Howard Chu: "Transactional databases are very simple"


Viacheslav Usov

4:42 p.m.

On Fri, Dec 11, 2015 at 4:24 PM, Howard Chu hyc@symas.com wrote:

> Aborting a transaction only discards the operations that occurred within that transaction.

I take this as a way of saying "a transaction won't invalidate a DBI successfully opened by a previous transaction". I suggest you add this clarification to the caveats section in the docs; it is not really obvious, even if you think otherwise. Remember how it was obvious to you that memory-mapped files in Windows cannot grow incrementally, while that was utter nonsense in my book :)

I am guessing you are not willing to explain the "obvious" non-difference between commit and abort in read-only transactions, but, again, others will appreciate an explicit clarification.

Cheers, V.


openldap-technical@openldap.org
