Hello
Improved handling for large number of databases
===============================================
There is an increasing performance penalty the more databases are created within the same environment. I was looking for a way to improve that while keeping the simplicity of tracking databases in a list with direct access by index (MDB_dbi). mdb_dbi_open() itself is not improved, on the assumption that the application caches the database handle (dbi), so mdb_dbi_open() should happen only once per database during the lifetime of an application.
One issue is that mdb_txn_begin() (for read-only transactions) callocs sizeof(MDB_txn) + me_maxdbs * (sizeof(MDB_db) + 1) bytes, the plus 1 being for the dbflags. However, it is sufficient to malloc that size and clear only the sizeof(MDB_txn) part:
memset(txn, 0, sizeof(MDB_txn));
After that, the data beyond the MDB_txn is not initialized, which is OK for the moment.
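To make the allocation change concrete, here is a minimal sketch of the pattern outside of LMDB. The types toy_txn and toy_db and the function toy_txn_alloc() are hypothetical stand-ins for illustration, not the real MDB_txn/MDB_db layouts or any LMDB function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for MDB_txn and MDB_db; not the real LMDB layouts. */
typedef struct { unsigned int numdbs; unsigned int flags; void *env; } toy_txn;
typedef struct { unsigned int depth; unsigned long entries; } toy_db;

/* Allocate a header followed by maxdbs per-database records and maxdbs
 * one-byte flags, but clear only the header. The trailing arrays are left
 * uninitialized and must be set up before they are read. */
static toy_txn *toy_txn_alloc(size_t maxdbs)
{
    /* Refuse sizes whose computation would overflow size_t. */
    if (maxdbs > (SIZE_MAX - sizeof(toy_txn)) / (sizeof(toy_db) + 1))
        return NULL;

    size_t size = sizeof(toy_txn) + maxdbs * (sizeof(toy_db) + 1);
    toy_txn *txn = malloc(size);
    if (txn == NULL)
        return NULL;

    memset(txn, 0, sizeof(toy_txn)); /* cost no longer depends on maxdbs */
    return txn;
}
```

The point of the sketch is only that the cost of clearing becomes independent of maxdbs; the trade-off is that every later reader of the trailing arrays has to be sure the slot was initialized first.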
The next improvement happens in mdb_txn_renew0() where the dbflags are only set to DB_UNUSED (a new flag) for each database currently opened in the environment.
memset(txn->mt_dbflags, DB_UNUSED, txn->mt_numdbs);
The former code used to loop through each database to calculate the dbflags. This is still done, but lazily for each accessed database, on the assumption that a read-only transaction rarely uses all databases of the environment.
The lazy initialization of the dbflags happens in the macro TXN_DBI_EXIST, which is always used when a database handle (dbi) is passed to a function. The flags are updated in mdb_setup_db_info() once a database that is marked as unused (DB_UNUSED) is accessed.
static int mdb_setup_db_info(MDB_txn *txn, MDB_dbi dbi, unsigned int validity) {
    /* Setup db info */
    uint16_t x = txn->mt_env->me_dbflags[dbi];
    txn->mt_dbs[dbi].md_flags = x & PERSISTENT_FLAGS;
    txn->mt_dbflags[dbi] = (x & MDB_VALID) ? DB_VALID|DB_USRVALID|DB_STALE : 0;
    return (txn->mt_dbflags[dbi] & validity);
}
/** Check \b txn and \b dbi arguments to a function and initialize db info if needed */
#define TXN_DBI_EXIST(txn, dbi, validity) \
    ((txn) && (dbi)<(txn)->mt_numdbs && \
     (((txn)->mt_dbflags[dbi] & (validity)) || \
      (((txn)->mt_dbflags[dbi] & DB_UNUSED) && \
       mdb_setup_db_info((txn), (dbi), (validity)))))
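The lazy setup can be tried out in isolation with a small model. This is only a sketch under assumptions: toy_txn, toy_txn_renew(), toy_setup(), toy_dbi_exist() and the TOY_* flag values are hypothetical names chosen for illustration, and the setups counter exists purely to show that the per-database work runs at most once per renew.

```c
#include <assert.h>
#include <string.h>

#define TOY_VALID  0x08u /* hypothetical: slot is usable in this txn */
#define TOY_UNUSED 0x20u /* hypothetical: slot not yet touched in this txn */
#define TOY_MAXDBS 16

typedef struct {
    unsigned char env_valid[TOY_MAXDBS]; /* 1 if open in the "environment" */
    unsigned char flags[TOY_MAXDBS];     /* per-transaction flags */
    unsigned int  numdbs;
    unsigned int  setups;                /* number of lazy setups performed */
} toy_txn;

/* Renew: mark every known slot as unused instead of computing its flags. */
static void toy_txn_renew(toy_txn *txn)
{
    assert(txn->numdbs <= TOY_MAXDBS);
    memset(txn->flags, TOY_UNUSED, txn->numdbs);
    txn->setups = 0;
}

/* Compute the real flags for one slot on first access. */
static int toy_setup(toy_txn *txn, unsigned int dbi, unsigned int validity)
{
    txn->flags[dbi] = txn->env_valid[dbi] ? TOY_VALID : 0;
    txn->setups++;
    return (txn->flags[dbi] & validity) != 0;
}

/* Function form of the check: bounds, then cached flags, then lazy setup. */
static int toy_dbi_exist(toy_txn *txn, unsigned int dbi, unsigned int validity)
{
    if (txn == NULL || dbi >= txn->numdbs)
        return 0;
    if (txn->flags[dbi] & validity)
        return 1;
    if (txn->flags[dbi] & TOY_UNUSED)
        return toy_setup(txn, dbi, validity);
    return 0;
}
```

In this model a second check of the same handle never reaches toy_setup() again, because the first call replaces TOY_UNUSED with the computed flags.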
The next improvement applies to any function that needs to loop through the databases, for example mdb_cursors_close(). Again, the more databases in the environment, the longer the execution time. It is better to loop only through the dbflags and look for those databases which are used (!DB_UNUSED). This could be done bytewise or, more efficiently, in 8- or 4-byte steps by comparing against an extended mask DB_UNUSED_LONG instead of DB_UNUSED. That way 8 (or 4 on 32-bit) unused databases can be skipped in one step (still assuming that a transaction rarely uses all databases of the environment).
So the loop looks as follows, always starting at the lower index to avoid alignment issues with ARM prior to v6.
#define DB_UNUSED 0x20 /**< DB not used in this txn */

#ifdef MDB_VL32
#define DB_UNUSED_LONG 0x20202020 /* DB_UNUSED long mask for fast tracking */
#else
#define DB_UNUSED_LONG 0x2020202020202020 /* DB_UNUSED long mask for fast tracking */
#endif

#ifdef MDB_VL32
#define MDB_WORD unsigned int
#else
#define MDB_WORD unsigned long long
#endif
MDB_dbi n = src->mt_numdbs;
MDB_dbi i = 0;

while (1) {
    unsigned int upper = i + sizeof(MDB_WORD);
    if (upper < n) {
        /* skip unused */
        if ((*(MDB_WORD *)(tdbflags + i)) == DB_UNUSED_LONG) {
            i = upper;
            continue;
        }
    } else {
        upper = n;
    }

    for (; i < upper; i++) {
        /* any other filter criteria appropriate to the function */
        ....
    }
    if (i >= n) {
        break;
    }
}
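For anyone who wants to experiment with the word-wise skip outside of mdb.c, here is a self-contained sketch. It is a variation, not the patch itself: toy_count_used() and the TOY_* constants are hypothetical, the filter is simply "flag byte differs from the unused marker", and the wide load is done with memcpy so that the flag array needs no particular alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_UNUSED      0x20u /* hypothetical marker, mirrors DB_UNUSED */
#define TOY_UNUSED_WORD UINT64_C(0x2020202020202020)

/* Count the slots whose flag byte is not TOY_UNUSED, skipping eight
 * untouched slots per comparison where possible. */
static size_t toy_count_used(const unsigned char *flags, size_t n)
{
    size_t i = 0, used = 0;

    assert(flags != NULL || n == 0);
    while (i < n) {
        if (n - i >= sizeof(uint64_t)) {
            uint64_t word;
            memcpy(&word, flags + i, sizeof word); /* alignment-safe load */
            if (word == TOY_UNUSED_WORD) {
                i += sizeof word; /* all eight slots unused: skip them */
                continue;
            }
        }
        /* Slow path: look at the next (up to) eight slots one by one. */
        size_t end = i + sizeof(uint64_t);
        if (end > n)
            end = n;
        for (; i < end; i++) {
            if (flags[i] != TOY_UNUSED)
                used++;
        }
    }
    return used;
}
```

Because every byte of the mask is identical, the comparison does not depend on byte order. Whether memcpy or a direct word load is faster on a given target is something only profiling can tell.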
Access newly opened database from another transaction
=======================================================
A transaction tracks newly opened databases, and if the transaction is committed, the newly opened databases are propagated to the environment's list of open databases. However, if the read-only transaction is aborted, the databases are not propagated. If database handles (MDB_dbi) are cached in the application to avoid calling mdb_dbi_open(), the following situation can arise with two threads running read-only transactions concurrently.
The first thread opens the database and commits its read-only transaction. The database is added to the list of open databases in the environment. The returned dbi is cached globally in the application.
The second thread also wants to access the same database and finds the database handle in the global application cache. The database handle, however, is not valid in its transaction, as the transaction only uses a snapshot of the open databases. So the second thread gets an EINVAL when using that database handle. This should not happen, as the database is open and has been added to the environment.
The following should fix this issue by updating mt_numdbs and marking the delta with DB_UNUSED:
static int mdb_update_db_info(MDB_txn *txn, MDB_dbi dbi) {
    /* propagate newly opened db from env to current txn. Mark them as unused */
    if (txn->mt_numdbs < txn->mt_env->me_numdbs) {
        memset(txn->mt_dbflags + txn->mt_numdbs, DB_UNUSED,
            txn->mt_env->me_numdbs - txn->mt_numdbs);
    }
    txn->mt_numdbs = txn->mt_env->me_numdbs;

    return dbi < txn->mt_numdbs;
}
/** Check \b txn and \b dbi arguments to a function and initialize db info if needed */
#define TXN_DBI_EXIST(txn, dbi, validity) \
    ((txn) && \
     ((dbi)<(txn)->mt_numdbs || mdb_update_db_info((txn), (dbi))) && \
     (((txn)->mt_dbflags[dbi] & (validity)) || \
      (((txn)->mt_dbflags[dbi] & DB_UNUSED) && \
       mdb_setup_db_info((txn), (dbi), (validity)))))
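The propagation step can also be modelled without LMDB. In this sketch snap_env, snap_txn and snap_update() are hypothetical names, the snapshot is only ever grown (a small deviation from the function above, which assigns the environment's count unconditionally), and thread synchronization on the environment's counter is deliberately left out.

```c
#include <assert.h>
#include <string.h>

#define SNAP_UNUSED 0x20u /* hypothetical marker, mirrors DB_UNUSED */
#define SNAP_MAXDBS 16

typedef struct { unsigned int numdbs; } snap_env;

typedef struct {
    snap_env     *env;
    unsigned int  numdbs;              /* snapshot of env->numdbs */
    unsigned char flags[SNAP_MAXDBS];
} snap_txn;

/* If the environment has gained handles since the snapshot was taken,
 * extend the snapshot and mark the new slots as unused so that they are
 * set up lazily on first access. Returns nonzero if dbi is now in range. */
static int snap_update(snap_txn *txn, unsigned int dbi)
{
    unsigned int env_n = txn->env->numdbs;

    assert(env_n <= SNAP_MAXDBS);
    if (txn->numdbs < env_n) {
        memset(txn->flags + txn->numdbs, SNAP_UNUSED, env_n - txn->numdbs);
        txn->numdbs = env_n;
    }
    return dbi < txn->numdbs;
}
```

A handle that is still out of range after the update is rejected as before, so an invalid dbi keeps failing.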
If interested let me know how to contribute.
Hope it is useful!
Regards Juerg
Jürg Bircher wrote:
Hello
Improved handling for large number of databases
If interested let me know how to contribute.
Looks interesting, yes. I assume you have profiled the code before and after the suggested changes, please provide your profiling results.
Please read the Developer Guidelines. http://www.openldap.org/devel/contributing.html
Access newly opened database from another transaction
Sounds like an oddball case. Applications should open all their DBIs from a single thread and not start any other threads/transactions until all setup is completed.
Hope it is useful!
Thanks.
Here are the test results (run on an iMac, 2.7 GHz Intel Core i5):
lmdb improved
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [10] databases in [0.41516] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [100] databases in [0.35304] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [1000] databases in [0.49425] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [10000] databases in [2.23236] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [100000] databases in [15.28527] seconds
lmdb original
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [10] databases in [0.35039] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [100] databases in [0.65547] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [1000] databases in [5.48897] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [10000] databases in [67.13091] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) with [100000] databases in [781.53778] seconds
As expected, with a small number of databases (10) the original lmdb is slightly faster, but from 100 databases upwards the improved handling outperforms the original implementation by a rapidly growing margin.
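To put numbers on that, the ratio original/improved can be computed directly from the begin/abort timings above. The helper names speedup() and print_speedups() are just for this illustration; the figures are copied from the two result lists. The ratios come out at roughly 0.84x, 1.9x, 11x, 30x and 51x for 10 to 100000 databases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_RUNS 5

/* Timings in seconds, copied from the begin/abort results above. */
static const unsigned int num_dbs[NUM_RUNS] = {10, 100, 1000, 10000, 100000};
static const double improved_s[NUM_RUNS] =
    {0.41516, 0.35304, 0.49425, 2.23236, 15.28527};
static const double original_s[NUM_RUNS] =
    {0.35039, 0.65547, 5.48897, 67.13091, 781.53778};

/* Ratio > 1 means the improved code was faster in that run. */
static double speedup(size_t run)
{
    assert(run < NUM_RUNS);
    return original_s[run] / improved_s[run];
}

/* Print one line per run, e.g. for pasting into a report. */
static void print_speedups(FILE *out)
{
    for (size_t i = 0; i < NUM_RUNS; i++)
        fprintf(out, "%6u databases: %.2fx\n", num_dbs[i], speedup(i));
}
```

These are single runs on one machine, so the ratios should be read as an indication of the trend rather than as precise measurements.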
Test code:
#include "lmdb.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/time.h>
#include <mach/clock.h>
#include <mach/mach.h>

static char *env_name = "/Developer/tmp/testdb";

#define MAX_MAP_SIZ (1024 * 1024 * 100)

#define NUM_ITERATIONS (1000 * 1000)

#define E(expr) CHECK((rc = (expr)) == MDB_SUCCESS, #expr)
#define RES(err, expr) ((rc = expr) == (err) || (CHECK(!rc, #expr), 0))
#define CHECK(test, msg) ((test) ? (void)0 : ((void)fprintf(stderr, \
    "%s:%d: %s: %s\n", __FILE__, __LINE__, msg, mdb_strerror(rc)), abort()))

static MDB_env *env;
static MDB_dbi main_dbi;

static MDB_dbi numDbs = 0;
static MDB_dbi *dbi;

/* Create the environment and open dbNum named databases in one write txn. */
void setup(unsigned int dbNum) {
    int rc;

    E(mdb_env_create(&env));
    E(mdb_env_set_maxreaders(env, 1));
    E(mdb_env_set_maxdbs(env, dbNum));
    E(mdb_env_set_mapsize(env, MAX_MAP_SIZ));
    E(mdb_env_open(env, env_name, 0, 0664));

    numDbs = dbNum;
    dbi = malloc(sizeof(MDB_dbi) * numDbs);
    MDB_txn *txn;

    E(mdb_txn_begin(env, NULL, 0, &txn));
    E(mdb_dbi_open(txn, NULL, 0, &main_dbi));

    for (unsigned int i = 0; i < numDbs; i++) {
        char name[16];

        sprintf(name, "%03x", i);

        E(mdb_dbi_open(txn, name, MDB_CREATE, &dbi[i]));
    }

    E(mdb_txn_commit(txn));
}

/* Close the environment and delete its files so each run starts fresh. */
void cleanup() {
    mdb_env_close(env);
    free(dbi); /* release the handle cache allocated in setup() */

    char name[1024];

    sprintf(name, "%s/data.mdb", env_name);
    unlink(name);
    sprintf(name, "%s/lock.mdb", env_name);
    unlink(name);
}

/* Wall-clock time via the Mach calendar clock (macOS only). */
struct timespec get_time() {
    struct timespec ts;

    clock_serv_t cclock;
    mach_timespec_t mts;
    host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
    clock_get_time(cclock, &mts);
    mach_port_deallocate(mach_task_self(), cclock);
    ts.tv_sec = mts.tv_sec;
    ts.tv_nsec = mts.tv_nsec;

    return ts;
}

/* Time num_iterations read-only transactions that each touch one database. */
void test(unsigned int num_iterations) {
    int rc;
    MDB_txn *txn;
    MDB_cursor *cursor;

    struct timespec ts = get_time();

    for (unsigned int i = 0; i < num_iterations; i++) {
        E(mdb_txn_begin(env, NULL, MDB_RDONLY, &txn));

        E(mdb_cursor_open(txn, dbi[0], &cursor));

        mdb_cursor_close(cursor);
        mdb_txn_abort(txn);
    }

    struct timespec te = get_time();

    printf("[%u] iterations (begin, cursor_open, cursor_close, abort) with [%u] databases in [%.5f] seconds\n\n",
        num_iterations, numDbs,
        ((double)te.tv_sec + 1.0e-9*te.tv_nsec) - ((double)ts.tv_sec + 1.0e-9*ts.tv_nsec));
}

int main(int argc, char *argv[]) {
    setup(10);
    test(NUM_ITERATIONS);
    cleanup();

    setup(100);
    test(NUM_ITERATIONS);
    cleanup();

    setup(1000);
    test(NUM_ITERATIONS);
    cleanup();

    setup(10000);
    test(NUM_ITERATIONS);
    cleanup();

    setup(100000);
    test(NUM_ITERATIONS);
    cleanup();

    return 0;
}
On 27/05/16 07:37, "Howard Chu" hyc@symas.com wrote:
Jürg Bircher wrote:
Hello
Improved handling for large number of databases
If interested let me know how to contribute.
Looks interesting, yes. I assume you have profiled the code before and after the suggested changes, please provide your profiling results.
Please read the Developer Guidelines. http://www.openldap.org/devel/contributing.html
Access newly opened database from another transaction
Sounds like an oddball case. Applications should open all their DBIs from a single thread and not start any other threads/transactions until all setup is completed.
Hope it is useful!
Thanks.
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Some more test results using mdb_txn_renew() and mdb_txn_reset():
lmdb improved
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [10] databases in [0.29014] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [100] databases in [0.27981] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [1000] databases in [0.45234] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [10000] databases in [1.80193] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [100000] databases in [16.44472] seconds
lmdb original
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [10] databases in [0.29650] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [100] databases in [0.45699] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [1000] databases in [4.16531] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [10000] databases in [49.69454] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) with [100000] databases in [561.20375] seconds
On 27/05/16 07:37, "Howard Chu" hyc@symas.com wrote:
Jürg Bircher wrote:
Hello
Improved handling for large number of databases
If interested let me know how to contribute.
Looks interesting, yes. I assume you have profiled the code before and after the suggested changes, please provide your profiling results.
Please read the Developer Guidelines. http://www.openldap.org/devel/contributing.html
Access newly opened database from another transaction
Sounds like an oddball case. Applications should open all their DBIs from a single thread and not start any other threads/transactions until all setup is completed.
Yes, opening all the databases at startup is the simpler way. However, if the environment contains many databases that are not necessarily all used during the lifetime of the application, it is an advantage to open them lazily. Every additional open database generates more overhead, especially when using begin and abort.
lmdb improved (renew, reset)
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 10] open databases in [0.31975] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [10 of 10] open databases in [0.20350] seconds

[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 100] open databases in [0.25845] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [100 of 100] open databases in [0.29663] seconds

[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 1000] open databases in [0.28590] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1000 of 1000] open databases in [0.42897] seconds

[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 10000] open databases in [0.30004] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [10000 of 10000] open databases in [1.68870] seconds
lmdb improved (begin, abort)
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 10] open databases in [0.36538] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [10 of 10] open databases in [0.35923] seconds

[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 100] open databases in [0.34858] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [100 of 100] open databases in [0.39294] seconds

[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 1000] open databases in [0.40222] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1000 of 1000] open databases in [0.54752] seconds

[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 10000] open databases in [0.78595] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [10000 of 10000] open databases in [2.32414] seconds
lmdb original (renew, reset)
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 10] open databases in [0.18597] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [10 of 10] open databases in [0.21572] seconds

[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 100] open databases in [0.24173] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [100 of 100] open databases in [0.46497] seconds

[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 1000] open databases in [0.27127] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1000 of 1000] open databases in [4.18579] seconds

[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [1 of 10000] open databases in [0.30128] seconds
[1000000] iterations (renew, cursor_open, cursor_close, reset) on 1 database with [10000 of 10000] open databases in [49.35048] seconds
lmdb original (begin, abort)
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 10] open databases in [0.45135] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [10 of 10] open databases in [0.38990] seconds

[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 100] open databases in [0.47890] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [100 of 100] open databases in [0.69917] seconds

[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 1000] open databases in [1.84908] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1000 of 1000] open databases in [5.88098] seconds

[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [1 of 10000] open databases in [22.12491] seconds
[1000000] iterations (begin, cursor_open, cursor_close, abort) on 1 database with [10000 of 10000] open databases in [74.53854] seconds
See ITS#8430 (Improved handling for large number of databases) and ITS#8431 (Access newly opened database from another transaction).
I would appreciate it if you would also consider adding the second improvement, as it is also performance relevant.
See the earlier posts in this thread for details and test results.
With kind regards
Rockethealth by Helmedica AG Web: www.rockethealth.ch Jürg Bircher Chief Technology Officer Mail: juerg.bircher@helmedica.ch
openldap-technical@openldap.org