Hello. (I sent the same message 24h ago before subscribing, but it hasn't arrived so far. I hope you won't see a duplicate later.)
We have a process that needs to analyze the contents of a whole LMDB. So far the approach has been to open a read-only transaction and use a cursor to iterate over the whole contents in order. This long transaction apparently causes other concurrent process(es) to get MDB_MAP_FULL from mdb_put(), even though that LMDB has plenty of free pages at that moment.
That would be a big problem for us, but fortunately we don't need the analysis to be atomic, and splitting that iteration into smaller transactions seems to avoid the problem. Those were just experiments, without understanding why exactly it happens (though I get that long transactions are generally problematic).
Could this behavior be considered a bug in LMDB? Is there a way of either distinguishing this MDB_MAP_FULL event from a genuinely full LMDB or avoiding it with confidence? (We only experimented with the number of keys accessed in one transaction.)
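(For reference, a minimal sketch of the full scan described above; env and dbi are assumed to be opened elsewhere, and analyze() is a hypothetical callback, not something from the thread.)

    #include <lmdb.h>

    void analyze(const MDB_val *key, const MDB_val *val);   /* hypothetical analysis hook */

    int scan_all(MDB_env *env, MDB_dbi dbi)
    {
        MDB_txn *txn;
        MDB_cursor *cur;
        MDB_val key, val;
        int rc = mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
        if (rc)
            return rc;
        rc = mdb_cursor_open(txn, dbi, &cur);
        if (rc) {
            mdb_txn_abort(txn);
            return rc;
        }
        /* One long read-only transaction for the whole iteration: the snapshot
         * stays pinned for as long as this loop runs. */
        while ((rc = mdb_cursor_get(cur, &key, &val, MDB_NEXT)) == MDB_SUCCESS)
            analyze(&key, &val);
        mdb_cursor_close(cur);
        mdb_txn_abort(txn);             /* read-only, so abort is fine */
        return rc == MDB_NOTFOUND ? MDB_SUCCESS : rc;
    }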
By the way, thanks a lot for LMDB!
--Vladimir https://knot-resolver.cz
Vladimír Čunát wrote:
Hello. (I sent the same message 24h ago before subscribing, but it hasn't arrived so far. I hope you won't see a duplicate later.)
We have a process that needs to analyze the contents of a whole LMDB. So far the approach has been to open a read-only transaction and use a cursor to iterate over the whole contents in order. This long transaction apparently causes other concurrent process(es) to get MDB_MAP_FULL from mdb_put(), even though that LMDB has plenty of free pages at that moment.
That would be a big problem for us, but fortunately we don't need the analysis to be atomic, and splitting that iteration into smaller transactions seems to avoid the problem. Those were just experiments, without understanding why exactly it happens (though I get that long transactions are generally problematic).
Could this behavior be considered a bug in LMDB?
It's a documented feature.
Is there a way of either distinguishing this MDB_MAP_FULL event from a genuinely full LMDB or avoiding it with confidence? (We only experimented with the number of keys accessed in one transaction.)
Maybe in LMDB 1.x we'll have a way to minimize the problem. For now, this is how it works.
By the way, thanks a lot for LMDB!
--Vladimir https://knot-resolver.cz
On 8/19/20 7:11 PM, Howard Chu wrote:
We have a process that needs to analyze the contents of a whole LMDB. So far the approach has been to open a read-only transaction and use a cursor to iterate over the whole contents in order. This long transaction apparently causes other concurrent process(es) to get MDB_MAP_FULL from mdb_put(), even though that LMDB has plenty of free pages at that moment.
[...]
It's a documented feature.
I haven't seen that in the docs at http://www.lmdb.tech/doc/ (I even skimmed the "internals" section). Am I missing some part? It's just stated that the _put functions can fail with MDB_MAP_FULL if "the database is full" (and I wouldn't consider it "full" in my case).
If that long transaction were getting MDB_TXN_FULL (or another special error), that would be more understandable to me, but those transactions appear fine while the other, short ones are getting MDB_MAP_FULL. Right now I have no indication of *when* to break those read-only transactions, except for guessing based on experiments.
On 8/20/20 9:05 AM, Ulrich Windl wrote:
If you can measure the "fullness" of LMDB while you read, you could suspend/resume the reading, in the way I described at the beginning, before LMDB gets too full.
I have no idea how to measure *this* kind of fullness. Our estimate shows there should be many free pages (even half of them)
    MDB_stat st;  // and use mdb_stat()
    size_t pgs_used = st.ms_branch_pages + st.ms_leaf_pages + st.ms_overflow_pages;
and this estimate won't change at all when done during the read-only transaction. So the only thing I can see to do is on the side of the RW transactions living in a different process: if a write gives me MDB_MAP_FULL and the fullness estimate shows that a large fraction of pages should be free, I might assume that the error will just disappear soon (when the other process finishes its RO transaction). That's not nice at all.
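(A sketch of how an estimate like the one above might be computed programmatically; the function name is illustrative. mdb_env_stat() covers the main DB only, so pages sitting in the freelist are not counted here.)

    #include <lmdb.h>

    /* Rough fraction of the map occupied by the main DB's pages. */
    double fullness_estimate(MDB_env *env)
    {
        MDB_envinfo ei;
        MDB_stat st;
        if (mdb_env_info(env, &ei) || mdb_env_stat(env, &st))
            return -1.0;                /* signal "unknown" */
        size_t max_pgs  = ei.me_mapsize / st.ms_psize;
        size_t pgs_used = st.ms_branch_pages + st.ms_leaf_pages + st.ms_overflow_pages;
        return (double)pgs_used / (double)max_pgs;
    }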
--Vladimir
Vladimír Čunát wrote:
On 8/19/20 7:11 PM, Howard Chu wrote:
We have a process that needs to analyze the contents of a whole LMDB. So far the approach has been to open a read-only transaction and use a cursor to iterate over the whole contents in order. This long transaction apparently causes other concurrent process(es) to get MDB_MAP_FULL from mdb_put(), even though that LMDB has plenty of free pages at that moment.
[...]
It's a documented feature.
I haven't seen that in the docs at http://www.lmdb.tech/doc/ (I even skimmed the "internals" section). Am I missing some part?
You definitely need to read more carefully. It's on the very first page of your link, under Caveats.
"Avoid long-lived transactions." ...
On 8/20/20 6:25 PM, Howard Chu wrote:
Vladimír Čunát wrote:
On 8/19/20 7:11 PM, Howard Chu wrote:
It's a documented feature.
I haven't seen that in the docs at http://www.lmdb.tech/doc/ (I even skimmed the "internals" section). Am I missing some part?
You definitely need to read more carefully. It's on the very first page of your link, under Caveats.
"Avoid long-lived transactions." ...
Right, thanks, I did miss the title page when re-reading.
It's a bit confusing that the paragraph after that sentence does not appear to apply to the observed case, but that's not really important... from this discussion I see no other option than shortening the transactions to some magical constant.
(Pages touched in a RO transaction can't be freed until it aborts - that was always clear. In our case, though, the overall number of pages doesn't seem to have run out, assuming the MDB_stat counts reflected their usage before all those operations began.)
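(For reference, a sketch of what such shortened transactions might look like: scan in batches of BATCH keys, close the read transaction between batches, and resume from the last processed key with MDB_SET_RANGE. BATCH and analyze() are illustrative, not from the thread; env/dbi setup is elided.)

    #include <string.h>
    #include <lmdb.h>

    #define BATCH 10000                 /* the "magical constant", tuned by experiment */

    void analyze(const MDB_val *key, const MDB_val *val);   /* hypothetical analysis hook */

    int scan_batched(MDB_env *env, MDB_dbi dbi)
    {
        char last[511];                 /* assumes keys fit LMDB's default max key size */
        size_t lastlen = 0;
        int have_last = 0, rc;

        for (;;) {
            MDB_txn *txn;
            MDB_cursor *cur;
            MDB_val key, val;
            size_t n = 0;

            if ((rc = mdb_txn_begin(env, NULL, MDB_RDONLY, &txn)))
                return rc;
            if ((rc = mdb_cursor_open(txn, dbi, &cur))) {
                mdb_txn_abort(txn);
                return rc;
            }
            if (have_last) {
                /* Re-position at the first key >= the last one already processed. */
                key.mv_data = last;
                key.mv_size = lastlen;
                rc = mdb_cursor_get(cur, &key, &val, MDB_SET_RANGE);
                if (rc == MDB_SUCCESS && key.mv_size == lastlen
                    && memcmp(key.mv_data, last, lastlen) == 0)
                    rc = mdb_cursor_get(cur, &key, &val, MDB_NEXT);  /* skip the one we did */
            } else {
                rc = mdb_cursor_get(cur, &key, &val, MDB_FIRST);
            }
            while (rc == MDB_SUCCESS && n < BATCH) {
                analyze(&key, &val);
                if (key.mv_size <= sizeof(last)) {
                    memcpy(last, key.mv_data, key.mv_size);  /* key data is only valid inside the txn */
                    lastlen = key.mv_size;
                    have_last = 1;
                }
                n++;
                rc = mdb_cursor_get(cur, &key, &val, MDB_NEXT);
            }
            mdb_cursor_close(cur);
            mdb_txn_abort(txn);         /* drop the snapshot between batches */
            if (rc == MDB_NOTFOUND)
                return MDB_SUCCESS;     /* reached the end of the DB */
            if (rc != MDB_SUCCESS)
                return rc;
            /* Batch limit hit: loop around with a fresh, short transaction. */
        }
    }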
--Vladimir
On Thu, Aug 20, 2020 at 07:42:06PM +0200, Vladimír Čunát wrote:
Right, thanks, I did miss the title page when re-reading.
It's a bit confusing that the paragraph after that sentence does not appear to apply to the observed case, but that's not really important... from this discussion I see no other option than shortening the transactions to some magical constant.
(Pages touched in a RO transaction can't be freed until it aborts - that was always clear. In our case, though, the overall number of pages doesn't seem to have run out, assuming the MDB_stat counts reflected their usage before all those operations began.)
I don't remember where to point you so in my own words...
You do understand that no pages reachable by an open read transaction can ever be reclaimed. At the moment, AFAIK a stronger claim can be made about LMDB:
For an open read transaction #n, it avoids reclaiming any pages freed by transaction #n or later. If you keep a read transaction open indefinitely, at some point you are going to run out of pages to reclaim and have to reach for fresh ones. Later, you might run out of those too. This is what you're seeing.
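(As an aside, not something mentioned in the thread: mdb_reader_list() dumps the reader lock table, roughly one line per reader slot with its pid, thread and the transaction ID it has open, which is one way to see which reader currently pins an old snapshot. A minimal sketch:)

    #include <stdio.h>
    #include <lmdb.h>

    static int print_line(const char *msg, void *ctx)
    {
        (void)ctx;
        fputs(msg, stdout);
        return 0;
    }

    /* Dump the reader lock table: which processes/threads hold a read
     * transaction open, and at which transaction ID. */
    void dump_readers(MDB_env *env)
    {
        mdb_reader_list(env, print_line, NULL);
    }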
Regards,
I'm sorry for the delay; I've been busy, mainly with related fixes.
On 8/20/20 10:01 PM, Ondřej Kuzník wrote:
On Thu, Aug 20, 2020 at 07:42:06PM +0200, Vladimír Čunát wrote:
(Pages touched in a RO transaction can't be freed until it aborts - that was always clear. In our case, though, the overall number of pages doesn't seem to have run out, assuming the MDB_stat counts reflected their usage before all those operations began.)
I don't remember where to point you so in my own words...
You do understand that no pages reachable by an open read transaction can ever be reclaimed. At the moment, AFAIK a stronger claim can be made about LMDB:
For an open read transaction #n, it avoids reclaiming any pages freed by transaction #n or later. If you keep a read transaction open indefinitely, at some point you are going to run out of pages to reclaim and have to reach for fresh ones. Later, you might run out of those too. This is what you're seeing.
No... the numbers do NOT match that notion at all. Now I have had time to gain more confidence by running proper tests, as follows.
Let's have a large LMDB and two processes accessing it. One process opens the DB, iterates over the whole contents in a single RO transaction (which always takes only a fraction of a second), closes the DB (to be sure), waits a split second and then repeats the whole thing.
The other process is continuously inserting into the DB in very small RW transactions. The speed of this is 1-2% of the DB size per second. There are no deletions, and in this case even replacements should be rare (as DB keys are generated randomly). At around 50% usage this gets MDB_MAP_FULL; at that point the process stops and the DB shows the state posted below.
All size measurements are in occupied pages, as described earlier in the thread. As stated before, shortening the RO transactions delays MDB_MAP_FULL until >99% usage... and that's our workaround for now (with some magical constant).
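(A sketch of that writer loop, under the stated assumptions; the random key generation and the value size are stand-ins, not from the thread.)

    #include <stdint.h>
    #include <stdlib.h>
    #include <lmdb.h>

    int write_until_full(MDB_env *env, MDB_dbi dbi)
    {
        static char value_buf[1024];    /* stand-in value payload */
        int rc;
        for (;;) {
            MDB_txn *txn;
            uint64_t k = ((uint64_t)rand() << 32) | (uint64_t)rand();  /* stand-in random key */
            MDB_val key = { sizeof(k), &k };
            MDB_val val = { sizeof(value_buf), value_buf };
            if ((rc = mdb_txn_begin(env, NULL, 0, &txn)))
                return rc;
            rc = mdb_put(txn, dbi, &key, &val, 0);
            if (rc) {
                mdb_txn_abort(txn);
                return rc;              /* MDB_MAP_FULL shows up here in the test */
            }
            if ((rc = mdb_txn_commit(txn)))
                return rc;
        }
    }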
Thanks --Vladimir
$ mdb_stat -ef .; and ls -l
Environment Info
  Map address: (nil)
  Map size: 671088640
  Page size: 4096
  Max pages: 163840
  Number of pages used: 163824
  Last transaction ID: 115523
  Max readers: 126
  Number of readers used: 0
Freelist Status
  Tree depth: 2
  Branch pages: 1
  Leaf pages: 203
  Overflow pages: 0
  Entries: 3321
  Free pages: 83500
Status of Main DB
  Tree depth: 5
  Branch pages: 4162
  Leaf pages: 75956
  Overflow pages: 0
  Entries: 627631