hyc@OpenLDAP.org writes:
slapd-bdb.5 1.40 -> 1.41 Add dbpagesize keyword for configuring DB file page sizes
If I understand the DB doc correctly, a pagesize parameter which means "use the file system's block size" could also be useful, at least for data integrity. Unless that's already the default.
Hallvard B Furuseth wrote:
hyc@OpenLDAP.org writes:
slapd-bdb.5 1.40 -> 1.41 Add dbpagesize keyword for configuring DB file page sizes
If I understand the DB doc correctly, a pagesize parameter which means "use the file system's block size" could also be useful, at least for data integrity. Unless that's already the default.
The default in slapd is an explicit 4KB. I agree that defaulting to the underlying FS block size might be better, but I don't know if it's worth changing now.
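As a sketch of what "default to the underlying FS block size" could look like: the helper below is hypothetical, not slapd code. It queries `st_blksize` via `stat(2)`, rounds down to a power of two, and clamps to Berkeley DB's legal page-size range (512 bytes to 64KB); the result would then be handed to `DB->set_pagesize()`.

```c
/* Hypothetical sketch, not slapd code: derive a DB page size from the
 * filesystem's preferred I/O block size.  Berkeley DB requires a power
 * of two between 512 and 65536; the result would be passed to
 * DB->set_pagesize(). */
#include <sys/stat.h>

static unsigned fs_pagesize(const char *path)
{
    struct stat st;
    unsigned size = 4096;               /* slapd's explicit default */

    if (stat(path, &st) == 0 && st.st_blksize > 0)
        size = (unsigned)st.st_blksize;

    /* Round down to a power of two by clearing low bits until only
     * the highest set bit remains. */
    while (size & (size - 1))
        size &= size - 1;

    /* Clamp to Berkeley DB's legal page-size range. */
    if (size < 512)   size = 512;
    if (size > 65536) size = 65536;
    return size;
}
```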
Howard Chu writes:
The default in slapd is an explicit 4KB.
Duh, the patch I was replying to said that...
BTW, if back-bdb and dbpagesize measure pages in 1K block units, back-bdb must ensure Berkeley DB doesn't pick 512-byte blocks.
I agree that defaulting to the underlying FS block size might be better,
Actually that's stronger than I suggested. It's been a while since I read the tuning guides, but I thought it was a trade-off for what one cares most about: performance, integrity, space. Maybe whether a database has many index keys with large ID lists.
Brainstorming a bit more: For a DB admin, I imagine it could be nice to allow "I don't care as long as you don't pick too small or too large pages". I.e. a pagesize range:

  dbpagesize [ * | file ] { integer | [integer]-[integer] }

"*" = all files not overridden by a "file" configuration.

  pagesize = (filesys_pagesize <= range_min ? range_min :
              filesys_pagesize >= range_max ? range_max :
              filesys_pagesize);
I have no idea if that's useful enough to bother with though. If one tries to write a template for a "general" config, it might be better to write a script which dynamically creates one.
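The clamp expression above, written out as a self-contained function (names like `range_min`/`range_max`, i.e. the bounds of a hypothetical `dbpagesize` range, are illustrative only):

```c
/* Sketch of the proposed range semantics: use the filesystem page size
 * if it falls inside [range_min, range_max], otherwise the nearest
 * bound.  range_min/range_max would come from a hypothetical
 * "dbpagesize file min-max" setting. */
static unsigned clamp_pagesize(unsigned filesys_pagesize,
                               unsigned range_min, unsigned range_max)
{
    return filesys_pagesize <= range_min ? range_min
         : filesys_pagesize >= range_max ? range_max
         : filesys_pagesize;
}
```

E.g. with a 2K-16K range, a 512-byte FS block size yields 2048, an 8K one is used as-is, and a 32K one is capped at 16384.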
I wrote:
dbpagesize [ * | file ] { integer | [integer]-[integer] }
Typo: dbpagesize { * | file } { integer | [integer]-[integer] }
I have no idea if that's useful enough to bother with though. If one tries to write a template for a "general" config, it might be better to write a script which dynamically creates one.
Probably most useful if you have several "identical" servers on different OSes, with a common slapd.conf from CVS or whatever. But even then one needs two almost-identical configurations: for the master and for slaves. Unless they are all (multi-)masters.
Hallvard B Furuseth wrote:
I wrote:
dbpagesize [ * | file ] { integer | [integer]-[integer] }
Typo: dbpagesize { * | file } { integer | [integer]-[integer] }
I have no idea if that's useful enough to bother with though. If one tries to write a template for a "general" config, it might be better to write a script which dynamically creates one.
Probably most useful if you have several "identical" servers on different OSes, with a common slapd.conf from CVS or whatever. But even then one needs two almost-identical configurations: for the master and for slaves. Unless they are all (multi-)masters.
Yes, that's precisely the type of scenario that worries me the most. Which is one argument for keeping the default hardcoded to 4K instead of letting BDB decide. That eliminates any ambiguity, and means that you can safely tune them all identically, because the underlying assumptions would be the same.
Howard Chu writes:
Probably most useful if you have several "identical" servers on different OSes, with a common slapd.conf from CVS or whatever. But even then one needs two almost-identical configurations: for the master and for slaves. Unless they are all (multi-)masters.
Yes, that's precisely the type of scenario that worries me the most. Which is one argument for keeping the default hardcoded to 4K instead of letting BDB decide.
If that's worrisome it sounds like filename parameter = "*" would be useful, so one at least gets the same size for all files. And so one can configure "I want <fixed vs filesystem> blocksize" in one sweep.
That eliminates any ambiguity, and means that you can safely tune them all identically, because the underlying assumptions would be the same.
Whether that's a bug or a feature depends on your goals. One reason to use several OSes is redundancy. Some bugs/problems only show up on some OSes, compilers, or whatever.
In any case, one such assumption would differ: "database block size == filesystem blocksize". Which may not matter for the back-bdb code, but it matters for database administration and maybe the Berkeley DB code.
Hallvard B Furuseth wrote:
Howard Chu writes:
The default in slapd is an explicit 4KB.
Duh, the patch I was replying to said that...
BTW, if back-bdb and dbpagesize measure pages in 1K block units, back-bdb must ensure Berkeley DB doesn't pick 512-byte blocks.
If you set an explicit size, BDB doesn't get to pick, so that's irrelevant.
Besides, no worthwhile filesystems in common use today use 512 byte blocks. The only thing that would possibly use such a size is FAT12 on a 360K floppy disk. (As I recall, by the time we got to 1.2MB floppies everyone had already switched to FAT16...) Anything using 512 byte blocks is obviously not suitable for database use.
I agree that defaulting to the underlying FS block size might be better,
Actually that's stronger than I suggested. It's been a while since I read the tuning guides, but I thought it was a trade-off for what one cares most about: performance, integrity, space. Maybe whether a database has many index keys with large ID lists.
Right.
Brainstorming a bit more: For a DB admin, I imagine it could be nice to allow "I don't care as long as you don't pick too small or too large pages". I.e. a pagesize range:

  dbpagesize [ * | file ] { integer | [integer]-[integer] }

"*" = all files not overridden by a "file" configuration.

  pagesize = (filesys_pagesize <= range_min ? range_min :
              filesys_pagesize >= range_max ? range_max :
              filesys_pagesize);
I have no idea if that's useful enough to bother with though. If one tries to write a template for a "general" config, it might be better to write a script which dynamically creates one.
Most people simply shouldn't touch this setting. If "I don't care" is true, then "don't touch it."
Anyone who changes this setting should have read all of the relevant BDB docs first, to understand the risks and tradeoffs. At that point, they should choose a specific number, none of this fuzzy approximate range stuff.
People who are running into frequent DB overflow conditions probably should be redesigning their DIT, not messing with the DB page sizes...
Howard Chu writes:
BTW, if back-bdb and dbpagesize measure pages in 1K block units, back-bdb must ensure Berkeley DB doesn't pick 512-byte blocks.
If you set an explicit size, BDB doesn't get to pick, so that's irrelevant.
OK. I was thinking the other way: If back-bdb reads the setting and makes use of it for something. I guess not.
Besides, no worthwhile filesystems in common use today use 512 byte blocks.
Famous last words... Next FS revolutions will see - oh, blocks of blocks where the inner blocks often are small. Or whatever.
Brainstorming a bit more: For a DB admin, I imagine it could be nice to allow "I don't care as long as you don't pick too small or too large pages". I.e. a pagesize range:
(...) Anyone who changes this setting should have read all of the relevant BDB docs first, to understand the risks and tradeoffs. At that point, they should choose a specific number, none of this fuzzy approximate range stuff.
Fine by me.
Hallvard B Furuseth wrote:
Howard Chu writes:
Besides, no worthwhile filesystems in common use today use 512 byte blocks.
Famous last words... Next FS revolutions will see - oh, blocks of blocks where the inner blocks often are small. Or whatever.
Heh. Total digression now. But no, not at all.
This isn't like saying "no one will ever need more than 640K." Historical trends show that resource sizes only ever increase, never decrease. Storage densities increase, capacities increase, and the overhead of managing the storage increases. The only way to keep a handle on such large capacities is to use larger and larger allocation units. The hard drive industry has been trying to push for 4KB sectors as a standard for years, because the command overhead with 512 byte sectors is too high. The SSD industry is using flash memory devices with erase blocks at 128KB and growing.
Filesystem block sizes can only increase, to keep up, otherwise performance will be abysmal. The only thing to be afraid of here is that BDB's 64K limit may already be too small for a lot of common technologies today.