On Tue, Jan 21, 2020 at 11:25 PM Howard Chu <hyc@symas.com> wrote:
Dmitri Shubin wrote:
> Hi,
>
> Our application uses LMDB, and sometimes we need to perform a scan of old ("cold") data.
> The problem is that this scan can (and will) evict "hot" data from the page cache.
>
> Ideally, to prevent this from happening, we'd like to somehow mark a cursor as "streaming"/"non-temporal".
> But AFAICS there is no such feature currently.
>
> We're experimenting with marking pages returned by the cursor with madvise(MADV_DONTNEED), and this at least prevents RSS from growing.
> But are there any downsides to this approach?

You obviously can only mark the pages after you've used the result. If you mark them
before returning the data to the caller, some other thread may cause those pages
to be reused before you get a chance to see the data, causing a wasted re-fetch.

Yes, we remember the key/value pointers returned by the cursor on the previous step, and once the newly returned key/value are on a different page we discard the previous page.
Our only concern is that on Linux madvise(MADV_DONTNEED) has destructive semantics and can throw away dirty pages.

But it seems we're safe since
 * we use such cursors only in read-only transactions (so we can never madvise malloc()'ed memory, i.e. the dirty pages of a write transaction);
 * we don't use MDB_WRITEMAP (although we do use MDB_NOSYNC | MDB_NOMETASYNC).
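
For reference, here is a minimal sketch in C of the bookkeeping described above. It is an illustration under stated assumptions, not our actual code: `cold_scan`, `cold_scan_init`, `page_of` and `cold_scan_advance` are hypothetical names and not part of LMDB. The caller is assumed to pass in the data pointer from each cursor step (for example `mv_data` of the `MDB_val` filled in by `mdb_cursor_get`), and every record is assumed to fit within one OS page, so overflow pages are not handled.

```c
#define _DEFAULT_SOURCE          /* for madvise() and MADV_DONTNEED on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper state: remembers the page of the previous record. */
typedef struct {
    void  *prev_page;  /* page-aligned address of the previous record, or NULL */
    size_t page_size;  /* OS page size (assumed to match the map's page size) */
    size_t dropped;    /* number of madvise() calls issued, for illustration */
} cold_scan;

static void cold_scan_init(cold_scan *s)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    s->prev_page = NULL;
    s->page_size = (size_t)ps;
    s->dropped = 0;
}

/* Round a pointer down to the start of the page that contains it. */
static void *page_of(const cold_scan *s, const void *p)
{
    return (void *)((uintptr_t)p & ~((uintptr_t)s->page_size - 1));
}

/*
 * Call with the data pointer of each newly returned record, after the
 * previous record has been fully consumed. If the new record lives on a
 * different page, the previous page is handed back with MADV_DONTNEED.
 * Returns 0 on success, or -1 if madvise() failed.
 */
static int cold_scan_advance(cold_scan *s, const void *data)
{
    void *page = page_of(s, data);
    int rc = 0;

    if (s->prev_page != NULL && page != s->prev_page) {
        rc = madvise(s->prev_page, s->page_size, MADV_DONTNEED);
        s->dropped++;
    }
    s->prev_page = page;
    return rc;
}
```

The page is only dropped once the cursor has moved on, which matches the point above about marking pages after the result has been used. Tracking the key pointer as well as the data pointer would need a second `prev_page` slot; that is left out here to keep the sketch short.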
