On Wed, Jun 4, 2014 at 1:04 PM, Howard Chu hyc@symas.com wrote:
Brian G. Merrell wrote:
On Wed, Jun 4, 2014 at 10:22 AM, Howard Chu <hyc@symas.com> wrote:
Brian G. Merrell wrote:
Hi all, First, I'm having trouble finding resources to answer a question like this myself, so please forgive me if I've missed something.
http://symas.com/mdb/doc/
Thanks. I did see and skim the API portion of the docs before asking, but I was just having trouble knowing how the pieces fit together to solve a problem.
Skimming isn't going to cut it.
Fair enough, I probably gave up prematurely. Blame my inferior intellect, but with zero other context into LMDB, I was having trouble getting a holistic view of LMDB from the docs. The information you've shared, though, has made the docs much more approachable. For whatever it's worth, I plan to write something up with my findings that will hopefully help someone.
Your reader process should be using read transactions.
OK, I interpret this as meaning that I need to pass the MDB_RDONLY flag to mdb_txn_begin. Is that correct?
Yes.
In the actual LMDB API, read transactions can be reused by their creating thread, so they are zero-cost after the first time. I don't know if any of the other language wrappers leverage this fact.
This helps a lot. I will investigate what the case is with gomdb.
Opening a DBI only needs to be done once per process. Opening per transaction would be stupid, like reopening a file handle on every request.
I suspected so. The fact that mdb_dbi_open takes a transaction had me confused a bit, because I thought I would need to pass in the new transaction every time I got a transaction from mdb_txn_begin.
mdb_dbi_open takes a txn because it needs one if you're creating a DB for the first time. I.e., it must write metadata for the DB into the environment, and all writes to MDB must be inside a txn. But once that txn is committed, the DBI itself lives on until mdb_dbi_close. This is all already explained in the doc for mdb_dbi_open; if you hadn't skimmed you would have seen it already.
Most of this is only a concern when you're using named subDBs. The default unnamed DB always exists, so its DBI is always valid anyway.
I will probably use named subDBs for my real application (instead of 9 separate databases like I do in LevelDB), so thanks for sharing.
I've refactored the reader to look like this:
env = NewEnv()
env.Open("/tmp/foo", 0, 0664)
txn = BeginTxn(nil, mdb.RDONLY) // parent txn is the nil arg
dbi = txn.DBIOpen(nil, 0)
txn.Abort()
You want mdb_txn_reset() here, not abort. Abort frees/destroys the txn handle so it cannot be reused.
while {
    txn = BeginTxn(nil, mdb.RDONLY) // parent txn is the nil arg
and here you want mdb_txn_renew(), to reuse the txn handle instead of creating a new one.
Ahah! Thank you. I had tried this before, but because I had used the txn.Abort() above, things did not go well. Now my benchmark times are back to what I would expect. I.e., they are comparable to the performance I was seeing when I had all transaction code outside of the loop (but wasn't seeing the data being updated after running my writer process).
    for i = 0; i < n_entries; i++ {
        key = sprintf("Key-%d", i)
        val = txn.Get(dbi, key)
        print("%s: %s", key, val)
    }
    txn.Commit()
and you want mdb_txn_reset() here too, not commit. Commit also frees/destroys the txn handle.
sleep(5)
}
You can abort or commit the txn during your process teardown phase to dispose of it.
env.DBIClose(dbi)
Now, I guess the big question is whether BeginTxn inside the loop is zero-cost.
Thanks for the tips so far Howard; it has been very helpful.
--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/