Hi,
I know this is difficult to answer, but is the mdb backend as it comes in 2.4.32 ready for a production master-master setup with somewhat less than 1 million entries? slapd-mdb(5) states it's an early release and that incompatible changes may occur, but on the other hand hdb changes its disk format from time to time too. So, what are the opinions?
The setup will be extensively tested, but if mdb is not ready yet I could skip those tests and concentrate on hdb.
Thanks, Karsten
Karsten Heymann wrote:
Hi,
I know this is difficult to answer, but is the mdb backend as it comes in 2.4.32 ready for a production master-master setup with somewhat less than 1 million entries? slapd-mdb(5) states it's an early release and that incompatible changes may occur, but on the other hand hdb changes its disk format from time to time too. So, what are the opinions?
The setup will be extensively tested, but if mdb is not ready yet I could skip those tests and concentrate on hdb.
The MDB library has not yet been optimized for write performance. I expect that there are still some changes coming for it, because it's clear we can still push it further.
But feature-wise back-mdb is complete, and we know of several companies that have tested it heavily for their own purposes and are deploying it in production.
Hi Howard,
2012/8/15 Howard Chu <hyc@symas.com>:
Karsten Heymann wrote:
I know this is difficult to answer, but is the mdb backend as it comes in 2.4.32 ready for a production master-master setup with somewhat less than 1 million entries?
[...]
But feature-wise back-mdb is complete, and we know of several companies that have tested it heavily for their own purposes and are deploying it in production.
Thanks. I will include it in our tests.
Regards Karsten
Hi,
On Wednesday, 15. August 2012, Howard Chu wrote:
The MDB library has not yet been optimized for write performance. I expect that there are still some changes coming for it, because it's clear we can still push it further.
But feature-wise back-mdb is complete, and we know of several companies that have tested it heavily for their own purposes and are deploying it in production.
Howard's reassuring answer made me want to switch from HDB to MDB too.
But this brought up a question: given an existing HDB database, is there a formula or something similar to calculate the 'maxsize' config option of MDB from the existing information?
Thanks in advance for your answers.
Best regards Peter
Peter Marschall wrote:
Hi,
On Wednesday, 15. August 2012, Howard Chu wrote:
The MDB library has not yet been optimized for write performance. I expect that there are still some changes coming for it, because it's clear we can still push it further.
But feature-wise back-mdb is complete, and we know of several companies that have tested it heavily for their own purposes and are deploying it in production.
Howard's reassuring answer made me want to switch from HDB to MDB too.
But this brought up a question: given an existing HDB database, is there a formula or something similar to calculate the 'maxsize' config option of MDB from the existing information?
In my testing, MDB typically uses about 60% as much space as HDB. Factor in however much future growth you anticipate and go from there.
As an example, I have a test LDIF that's 558694630 bytes, containing 380836 entries. Looking at info from [m]db_stat, we can compare the number of pages used for each index:
            hdb                          mdb
         branch    leaf  overflow    branch    leaf  overflow
  dn2id     328    8097         0        67    7625         0
  id2e      344  249856     59368       263   29681    293169
  oc         11     154         0         1       3         0
  uid      2487   26392         0        65   10895         0
(page sizes are normalized here; hdb id2entry uses 16K pages while all other databases use 4K pages)
This is with

  index objectclass eq
  index uid eq,sub
The dn2id, oc, and uid database formats are logically identical between hdb and mdb, so the difference in size is due to the difference between BDB and MDB. The id2entry database in mdb uses a slightly different encoding than hdb's, so there are both library and backend format differences there.
As you can see, the more indexing you use, the bigger the difference between mdb and hdb.
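For readers who want to gather the same page statistics for their own databases, something like the following should work; the directory paths are only placeholders for your actual HDB and MDB database directories.

  # BerkeleyDB/HDB: btree statistics (page counts) for one index file;
  # repeat for each *.bdb file in the database directory
  db_stat -h /var/lib/ldap/hdb -d id2entry.bdb

  # LMDB/MDB: statistics for all named sub-databases in the environment
  mdb_stat -a /var/lib/ldap/mdb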
Hi,
On Saturday, 18. August 2012, Howard Chu wrote:
Peter Marschall wrote:
But this brought up a question: given an existing HDB database, is there a formula or something similar to calculate the 'maxsize' config option of MDB from the existing information?
In my testing, MDB typically uses about 60% as much space as HDB. Factor in however much future growth you anticipate and go from there.
As an example, I have a test LDIF that's 558694630 bytes, containing 380836 entries. Looking at info from [m]db_stat, we can compare the number of pages used for each index:
            hdb                          mdb
         branch    leaf  overflow    branch    leaf  overflow
  dn2id     328    8097         0        67    7625         0
  id2e      344  249856     59368       263   29681    293169
  oc         11     154         0         1       3         0
  uid      2487   26392         0        65   10895         0
(page sizes are normalized here; hdb id2entry uses 16K pages while all other databases use 4K pages)
This is with

  index objectclass eq
  index uid eq,sub
The dn2id, oc, and uid database formats are logically identical between hdb and mdb, so the difference in size is due to the difference between BDB and MDB. The id2entry database in mdb uses a slightly different encoding than hdb's, so there are both library and backend format differences there.
As you can see, the more indexing you use, the bigger the difference between mdb and hdb.
Thanks for the explanation, Howard.
I may be a bit thick today, but I do not see how I can determine a minimal value for MDB's *maxsize* parameter from the values given above. (I do not want to waste memory ;-)
Shall I simply take the LDIF size as maxsize? Or the combined size of the files in the HDB database? ...
Thanks in advance, Peter
Peter Marschall wrote:
Hi,
On Saturday, 18. August 2012, Howard Chu wrote:
Peter Marschall wrote:
But this brought up a question: given an existing HDB database, is there a formula or something similar to calculate the 'maxsize' config option of MDB from the existing information?
In my testing, MDB typically uses about 60% as much space as HDB. Factor in however much future growth you anticipate and go from there.
As an example, I have a test LDIF that's 558694630 bytes, containing 380836 entries. Looking at info from [m]db_stat, we can compare the number of pages used for each index:
            hdb                          mdb
         branch    leaf  overflow    branch    leaf  overflow
  dn2id     328    8097         0        67    7625         0
  id2e      344  249856     59368       263   29681    293169
  oc         11     154         0         1       3         0
  uid      2487   26392         0        65   10895         0
(page sizes are normalized here; hdb id2entry uses 16K pages while all other databases use 4K pages)
This is with

  index objectclass eq
  index uid eq,sub
The dn2id, oc, and uid database formats are logically identical between hdb and mdb, so the difference in size is due to the difference between BDB and MDB. The id2entry database in mdb uses a slightly different encoding than hdb's, so there are both library and backend format differences there.
As you can see, the more indexing you use, the bigger the difference between mdb and hdb.
Thanks for the explanation, Howard.
I may be a bit thick today, but I do not see how I can determine a minimal value for MDB's *maxsize* parameter from the values given above. (I do not want to waste memory ;-)
Your comment makes no sense. There is no memory being wasted. Setting the size of the memory map only reserves address space and sets an upper limit on the size of the DB file on disk. If the DB never grows to the size you configure, then who cares. There is no point in minimizing the configured size; you're not saving anything.
Set it to a few hundred GB. If the DB only contains 1 MB of data, then that's all it will use. You only need to be concerned if your DB will in fact grow large enough to overflow your disks. Another possibility is that you're running multiple databases at once and you've given them a terabyte each; you'll only be able to create 100 or so databases before you start impacting shared library address space, etc. (On a contemporary 64-bit CPU with only a 48-bit virtual address space.)
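A minimal slapd.conf sketch along those lines might look like this; the suffix, directory, and exact figure are placeholders, not values from this thread:

  database  mdb
  suffix    "dc=example,dc=com"
  directory /var/lib/ldap
  # Reserve 200 GiB of address space. This is only an upper bound on the
  # size of data.mdb; the file grows on demand, and the ~60% rule of thumb
  # against the current HDB size (plus growth headroom) gives a rough sense
  # of how much of it will actually be used.
  maxsize   214748364800

The cn=config equivalent is the olcDbMaxSize attribute on the olcDatabase={n}mdb entry.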
I'm pretty sure that if you're making comments like this, this scenario doesn't apply to you.