Re: mdb_dbi_open and threads

22 May 2017


Hallvard wrote:
"Currently a moderate number of slots are cheap but a huge number gets
expensive: 7-120 words per transaction, and every #mdb_dbi_open()
does a linear search of the opened slots."
I haven't seen a performance hit with around 10000 named databases. By the
way, I was hoping to open those dbis only on demand rather than opening all
of them at initialization.
"With threads 1 and 2 coexisting? When thread 2 called mdb_dbi_open(),
thread 1's prospect of using mdb_dbi_open() at all was lost."
Yes, with both coexisting. That's what I thought.
@Klaus
Yes, I know there can be only one write transaction at a time. I was talking
about one write transaction and one or more read transactions.
It is not that I deliberately open the dbi in a read transaction first. The
problem is that I can't guarantee that another read transaction won't start
and attempt to open the same named dbi while a write is in progress.
"And first looking in a read transaction whether a database exists and then
creating it in a second write transaction is definitely a bad and risky
programming style, as it carries an assumption from one transaction to the
next, which is typically not valid."
That was not what I tried to do.
"you still have the option to combine all your logical databases into a big
single database"
It's a workaround that I hadn't thought of before. I was hoping to avoid the
extra complexity.
Is there any prospect of changing mdb_dbi_open, or adding something like an
mdb_db_open_immediate, to put the dbi into the shared environment without
waiting for the txn commit? I learned earlier from Howard Chu that this is
not desirable under ACID. But I ask just in case, because otherwise (without
opening all the dbis at initialization) in a multi-threaded environment, the
chance that opening a dbi on demand ends in failure goes up.
On Mon, May 22, 2017 at 2:01 PM, Klaus Malorny Klaus.Malorny@knipp.de
wrote:
...
On 5/21/17 9:43 PM, Muhammed Muneer wrote:
...
Howard Chu wrote
"Just follow the recommendation to open all handles at the beginning of
the program."
But what if I have lots of named databases, maybe 10000 or more? Wouldn't
this be expensive?
I am developing a MongoDB-like database (similar in query and update
syntax) around LMDB.
I have some enhancements of my own, like the ability to generate update
queries from within an ongoing update.
So in a multi-threaded environment, suppose the name of a named dbi is
generated from within a write transaction (thread 1), which then proceeds to
mdb_dbi_open it, only to find that a read transaction (thread 2) opened the
same named dbi just after thread 1's write txn started. In that case the
prospect of thread 1 opening that named dbi with mdb_dbi_open is lost
forever.
Please remember that you can have only one writing transaction at once.
And first looking in a read transaction whether a database exists and then
creating it in a second write transaction is definitely a bad and risky
programming style, as it carries an assumption from one transaction to the
next, which is typically not valid.
I have no experience with a large number of databases, but if it is a
performance problem as Hallvard and the docs describe, then you still have
the option to combine all your logical databases into a big single
database. In this case you would maintain a database ID (e.g. four byte
integer) that is prepended to the user provided key for all get and put
operations. Only some care needs to be taken for range searches and cursor
operations, as you might get a key/value pair that belongs to another
logical database, but this is not a big deal. I use that approach for
composite search keys quite a lot.
The association between database names and their IDs could be maintained
in a separate database.
Regards,
Klaus
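
For illustration, the key-prefix approach Klaus describes could look roughly
like the sketch below. It is only a sketch under some assumptions: the helper
names (make_prefixed_key, key_in_logical_db) are hypothetical and not part of
LMDB, the ID is a four-byte integer as in his example, and the ID is written
big-endian so that, with LMDB's default byte-wise key comparison, all keys of
one logical database sort together. The second helper is the kind of check a
range search or cursor loop would need before accepting a key/value pair.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DB_ID_LEN 4 /* four-byte logical database ID, as in Klaus's example */

/* Hypothetical helper: write db_id as 4 big-endian bytes followed by the
 * user-provided key. Returns the total key length, or 0 if the output
 * buffer is too small. */
size_t make_prefixed_key(uint32_t db_id, const void *user_key, size_t user_len,
                         unsigned char *out, size_t out_cap)
{
    if (out_cap < DB_ID_LEN || user_len > out_cap - DB_ID_LEN)
        return 0;
    out[0] = (unsigned char)(db_id >> 24);
    out[1] = (unsigned char)(db_id >> 16);
    out[2] = (unsigned char)(db_id >> 8);
    out[3] = (unsigned char)db_id;
    if (user_len > 0)
        memcpy(out + DB_ID_LEN, user_key, user_len);
    return DB_ID_LEN + user_len;
}

/* Hypothetical helper: returns 1 if a stored key carries the prefix of the
 * logical database db_id, 0 otherwise. A cursor loop would stop (or skip)
 * when this returns 0. */
int key_in_logical_db(uint32_t db_id, const unsigned char *key, size_t key_len)
{
    uint32_t found;

    if (key_len < DB_ID_LEN)
        return 0;
    found = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
            ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    return found == db_id;
}
```

The buffer filled by make_prefixed_key would then be used as the key data for
get and put calls on the single shared database, and the name-to-ID mapping
would live in a separate database as Klaus suggests.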
