Just recently measured: when doing a bulk-load of a DB that's larger than RAM, it's faster to turn off WRITEMAP and just use regular writes.
When the DB is much larger than RAM, and LMDB is reusing old pages, the next page to be written will most likely not be in memory. If you write to that page through the map, the OS has to page it in first. That is an unnecessary I/O operation, since you're going to overwrite its contents anyway. If you do a regular write() from a buffer instead, the OS just writes the buffer to the target page, with no page-in required.
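To make the two write paths concrete, here is a minimal sketch in Python rather than LMDB's own C. The names `PAGE_SIZE`, `overwrite_via_map` and `overwrite_via_write` are made up for illustration, the 4096-byte page size is an assumption, and none of this is LMDB code. It only shows the difference in mechanism: a store through a shared mapping versus a positioned write from a buffer.

```python
import mmap
import os

PAGE_SIZE = 4096  # assumed page size, for illustration only


def overwrite_via_map(path, page_no, data):
    """Overwrite one whole page by storing through a shared writable mapping.

    If the target page is not resident, the store faults and the kernel
    reads the old contents from disk before the new bytes can land.
    """
    assert len(data) == PAGE_SIZE
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            off = page_no * PAGE_SIZE
            m[off:off + PAGE_SIZE] = data
            m.flush()  # push the dirty page back to the file


def overwrite_via_write(path, page_no, data):
    """Overwrite one whole page with a positioned write from a buffer.

    A write that covers the whole page gives the kernel the complete new
    contents, so it has no need to read the old page first.
    """
    assert len(data) == PAGE_SIZE
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, data, page_no * PAGE_SIZE)
        os.fsync(fd)
    finally:
        os.close(fd)
```

Both functions leave the file in the same state. The extra read on the mapped path only shows up when the file is much larger than RAM and the target page has been evicted, so a small test file will not reproduce the timing difference.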
Strangely enough, this performance advantage disappears under an active random read/write workload. I haven't yet worked out why that is. Perhaps the cost of multiple memcpy calls comes into play.