Re: Fwd: multiple sequential lmdb readers + spinning media = slow / thrashes?

27 Feb 2015


      Matthew Moskewicz wrote:
...
warnings: new to list, first post, lmdb noob.
i'm a caffe user:
https://github.com/BVLC/caffe
in one use case, caffe sequentially streams through >100GB lmdbs at a
rate of ~30MB/s in blocks of about 40MB. however, if multiple caffe
processes are reading the same lmdb (opened with MDB_RDONLY), read
performance becomes limiting (i.e. the processes become IO bound), even
though the disk has sufficient read bandwidth (say ~180MB/s). some of
the relevant caffe lmdb code is here:
https://github.com/BVLC/caffe/blob/master/src/caffe/util/db.cpp
however, if i *both*:

(1) run blockdev --setra 65536 --setfra 65536 /dev/sdwhatever
(2) modify lmdb to call posix_madvise(env->me_map, env->me_mapsize,
    POSIX_MADV_SEQUENTIAL);

then i can get >1 reader to run without being IO limited.
This is quite timing-dependent - if you start your multiple readers at exactly the same time and they run at exactly the same speed, then they will all be using the same cached pages and all of the readers can run at the full bandwidth of the disk. If they're staggered or not running in lockstep, then you'll only get partial performance.
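
For readers unfamiliar with the posix_madvise call mentioned above, here is a minimal sketch of the same hint applied to an ordinary read-only file mapping. It is not the LMDB patch itself: the helper name scan_sequential and its return convention are made up for illustration, and the hint is purely advisory, so how much readahead the kernel actually does is system-dependent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of LMDB): map `path` read-only, hint
 * sequential access, then touch every byte front to back.
 * Returns 0 on success and -1 on error. On success *sum_out holds the
 * sum of all bytes, so the reads cannot be optimized away. */
int scan_sequential(const char *path, uint64_t *sum_out)
{
    assert(path != NULL && sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    /* Advisory only: asks the kernel to expect front-to-back access.
     * A failure here is reported but not treated as fatal. */
    int rc = posix_madvise(map, len, POSIX_MADV_SEQUENTIAL);
    if (rc != 0)
        fprintf(stderr, "posix_madvise failed: %d\n", rc);

    const unsigned char *p = map;
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    munmap(map, len);
    *sum_out = sum;
    return 0;
}
```

Calling scan_sequential("/some/large/file", &sum) from two processes started at different times is one way to observe the staggered-reader effect described above, since each process then depends on its own readahead window instead of pages the other has already pulled into the cache.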
...
for (2), see https://github.com/moskewcz/scratch/tree/lmdb_seq_read_opt
similarly, using a sequential read microbenchmark designed to model the
caffe reads from here:
https://github.com/moskewcz/boda/blob/master/src/lmdbif.cc
if i run one reader, i get 180MB/s bandwidth.
with two readers, but neither (1) nor (2) above, each gets ~30MB/s
bandwidth.
with (1) and (2) enabled, and two readers, each gets ~90MB/s bandwidth.
The other point to note is that sequential reads in LMDB won't remain truly sequential (as seen by the storage device) after a few rounds of inserts/deletes/updates. Once you get any element of seek/random I/O in here your madvise will be useless.
...
any advice?
mwm
PS: backstory (skippable):
caffe originally used LevelDB to get better read performance for
sequentially loading sets of ~1M 227x227x3 raw images (~200GB data).
typically processing time is ~2 hours for this data set size, yielding a
read BW need of 30MB/s or so. it's not really clear if/why LevelDB was
used aside from the fact that the caffe author was a google intern at
the time he wrote it, but anecdotally i think the claim is that reading
the raw .jpgs had perf. issues, although it's unclear exactly what or
why. i guess it was the usual story about not getting sequential reads
without using LevelDB. they switched to lmdb a while back.
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
