Just recently measured: when doing a bulk-load of a DB that's larger than RAM, it's faster to turn off WRITEMAP and just use regular writes.
When the DB is much larger than RAM, and LMDB is reusing old pages, most likely the next page to be written will not currently be in memory. If you just access the mapped page (writing to it) the OS will have to page it in first. This is an unnecessary I/O operation since you're simply going to overwrite its contents anyway. If you do a regular write() from a buffer instead, the OS just writes it to the target page, no page-in required.
Strangely enough, this performance advantage disappears when under an active random read/write workload. I haven't yet worked out why that is. Perhaps the cost of multiple memcpy's comes into play.
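The two write paths described above can be sketched outside of LMDB. This is only an illustration under stated assumptions, not LMDB's actual code: the helper names (`make_db_file`, `overwrite_via_mmap`, `overwrite_via_pwrite`, `read_page`) are hypothetical, the page size is assumed to equal the OS page size, and a POSIX system is assumed. Both paths leave the same bytes in the file. The difference the post describes is in what the kernel has to do first: a store through the mapping needs the old page resident, while a full-page write from a private buffer can replace it outright. The sketch shows the mechanics only and does not measure anything.

```python
import mmap
import os
import tempfile

# Hypothetical page size for this sketch; assumed equal to the OS page size.
PAGE_SIZE = mmap.PAGESIZE


def make_db_file(path, n_pages):
    """Create a file of n_pages pages filled with stale data to be overwritten."""
    with open(path, "wb") as f:
        f.write(b"\xaa" * (PAGE_SIZE * n_pages))


def overwrite_via_mmap(path, page_no, data):
    """Overwrite one page by storing into a shared mapping (the WRITEMAP-style path).

    If the old page is not cached, the kernel has to fault it in before the
    store can land, even though every byte is about to be replaced.
    """
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            off = page_no * PAGE_SIZE
            m[off:off + PAGE_SIZE] = data  # data must be exactly PAGE_SIZE bytes
            m.flush()


def overwrite_via_pwrite(path, page_no, data):
    """Overwrite one page with a regular write from a private buffer."""
    fd = os.open(path, os.O_RDWR)
    try:
        written = os.pwrite(fd, data, page_no * PAGE_SIZE)
        if written != len(data):
            raise OSError("short write")
        os.fsync(fd)
    finally:
        os.close(fd)


def read_page(path, page_no):
    """Read one page back through the ordinary file API."""
    with open(path, "rb") as f:
        f.seek(page_no * PAGE_SIZE)
        return f.read(PAGE_SIZE)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        make_db_file(path, 4)
        overwrite_via_mmap(path, 1, b"\x01" * PAGE_SIZE)
        overwrite_via_pwrite(path, 2, b"\x02" * PAGE_SIZE)
        print(read_page(path, 1)[:4], read_page(path, 2)[:4])
```

To see the effect the post reports, the file would have to be much larger than RAM and the target pages would have to be cold; on a small, fully cached file both paths behave about the same.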
Howard Chu hyc@symas.com wrote on 06.11.2014 at 14:42 in message
> Just recently measured: when doing a bulk-load of a DB that's larger than RAM, it's faster to turn off WRITEMAP and just use regular writes.
Doesn't that depend on the ratio of change compared to the size of RAM or DB?
> When the DB is much larger than RAM, and LMDB is reusing old pages, most likely the next page to be written will not currently be in memory. If you just access the mapped page (writing to it) the OS will have to page it in first. This is an unnecessary I/O operation since you're simply going to
This would mean LMDB has a copy of the mmapped file; otherwise the OS has to page in some memory to provide a buffer for the data to be written anyway. If you copy the mmapped file to/from private buffers, the main advantage of mmapping a file goes away.
> overwrite its contents anyway. If you do a regular write() from a buffer instead, the OS just writes it to the target page, no page-in required.
The OS still may have to allocate a buffer for the transfer, causing some paging activity.
> Strangely enough, this performance advantage disappears when under an active random read/write workload. I haven't yet worked out why that is. Perhaps the cost of multiple memcpy's comes into play.
Single sync random writes, or async random writes? If sync writes, have you examined using writev() instead to write multiple blocks? When talking about writes and blocks: do the blocks written match the block size and alignment of the filesystem, is the block size of the filesystem at least one physical sector of the medium, and are the blocks of the filesystem aligned with the medium (I am thinking of the now-popular 4 kB sectors)?
Regards, Ulrich
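The writev() suggestion and the alignment question can be illustrated with a short sketch. It is not taken from LMDB; the helper names (`is_block_aligned`, `filesystem_block_size`, `write_contiguous_pages`) are hypothetical and a POSIX system is assumed. The filesystem block size comes from `os.fstatvfs`. The physical sector size of the medium is not available through a portable call, so the sketch does not check it.

```python
import os
import tempfile


def is_block_aligned(offset, length, block_size):
    """True if a write starts on a block boundary and covers whole blocks."""
    return offset % block_size == 0 and length % block_size == 0


def filesystem_block_size(fd):
    """Preferred I/O block size of the filesystem holding fd."""
    return os.fstatvfs(fd).f_bsize


def write_contiguous_pages(fd, first_offset, pages):
    """Write several buffers that are adjacent in the file with one writev() call.

    The buffers stay separate in memory; the kernel gathers them, so the
    caller does not have to copy them into one large buffer first.
    """
    total = sum(len(p) for p in pages)
    os.lseek(fd, first_offset, os.SEEK_SET)
    written = os.writev(fd, pages)
    if written != total:
        # A real implementation would retry with the remaining bytes.
        raise OSError("short write: %d of %d bytes" % (written, total))
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        fd = os.open(os.path.join(d, "data.bin"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            bs = filesystem_block_size(fd)
            pages = [bytes([i]) * bs for i in range(3)]
            n = write_contiguous_pages(fd, bs, pages)
            print("wrote", n, "bytes; aligned:", is_block_aligned(bs, n, bs))
        finally:
            os.close(fd)
```

Gathering only helps when the pages are contiguous in the file. Scattered pages still need one call per run of adjacent pages.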
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Ulrich Windl wrote:
> Howard Chu hyc@symas.com wrote on 06.11.2014 at 14:42 in message
>> Just recently measured: when doing a bulk-load of a DB that's larger than RAM, it's faster to turn off WRITEMAP and just use regular writes.
> Doesn't that depend on the ratio of change compared to the size of RAM or DB?
Read and think before you post. You have no idea WTH you're talking about, but you could easily have read the code or docs or papers before posting and actually known something.
Howard Chu hyc@symas.com wrote on 07.11.2014 at 11:16 in message
> Ulrich Windl wrote:
>> Howard Chu hyc@symas.com wrote on 06.11.2014 at 14:42 in message
>>> Just recently measured: when doing a bulk-load of a DB that's larger than RAM, it's faster to turn off WRITEMAP and just use regular writes.
>> Doesn't that depend on the ratio of change compared to the size of RAM or DB?
> Read and think before you post. You have no idea WTH you're talking about, but you could easily have read the code or docs or papers before posting and actually known something.
I cannot follow your reasoning why you think that I don't think, and what convinced you that I have no idea what I'm talking about.
Wasn't it you who made some vague performance statement/recommendation without presenting any numbers or details on the tests one might assume you have made before writing?
I also don't see why I should read your code just because you made some statements I cannot follow.
Finally, I (and probably others as well) would prefer not to see this type of reply, because it doesn't help anybody, ... (the rest censored by myself)...
Ulrich Windl wrote:
> Howard Chu hyc@symas.com wrote on 07.11.2014 at 11:16 in message
>> Ulrich Windl wrote:
>>> Howard Chu hyc@symas.com wrote on 06.11.2014 at 14:42 in message
>>>> Just recently measured: when doing a bulk-load of a DB that's larger than RAM, it's faster to turn off WRITEMAP and just use regular writes.
>>> Doesn't that depend on the ratio of change compared to the size of RAM or DB?
>> Read and think before you post. You have no idea WTH you're talking about, but you could easily have read the code or docs or papers before posting and actually known something.
> I cannot follow your reasoning why you think that I don't think, and what convinced you that I have no idea what I'm talking about.
> Wasn't it you who made some vague performance statement/recommendation without presenting any numbers or details on the tests one might assume you have made before writing?
There is nothing vague about "bulk-load a DB that's larger than RAM."
> I also don't see why I should read your code just because you made some statements I cannot follow.
If you aren't going to do the basic homework then you have nothing to contribute.
openldap-technical@openldap.org