[Issue 9735] New: [PATCH] try hard to find free space if database cannot grow

List overview All Threads
Download

newer

older

[Issue 10107] New: Update support...

[Issue 10102] New: DB_LOCK_DEADLOCK

openldap-its＠openldap.org

3 Nov 2021 3 Nov '21

1:51 a.m.

https://bugs.openldap.org/show_bug.cgi?id=9735

Issue ID: 9735 Summary: [PATCH] try hard to find free space if database cannot grow Product: LMDB Version: 0.9.24 Hardware: All OS: Linux Status: UNCONFIRMED Keywords: needs_review Severity: normal Priority: --- Component: liblmdb Assignee: bugs@openldap.org Reporter: libor.peltan@nic.cz Target Milestone: ---

Created attachment 851 --> https://bugs.openldap.org/attachment.cgi?id=851&action=edit Patch fixing the issue "try hard to find free space if database cannot grow"

Note: - the issue is the same in version 0.9.70 (git)

Situation: - the database had already grown to its limit (mapsize) in the past - overflow pages are used heavily as stored values are usually several pages long - free space got fragmented

Problem: - attempt to insert new value results in MDB_MAP_FULL despite there is free space available

Cause: there is a heursitic in mdb_page_alloc() that gives up searching for free space chunk if this would take too much time. This is useful when the database can still grow, as it balances performance with space usage. However, if the database can no longer grow, it prevents inserting new values.

Solution: detect early on in mdb_page_alloc() if the database can grow, and if not, let it try hard to search for free space.

Patch: attached

-- You are receiving this mail because: You are on the CC list for the issue.

Show replies by date

openldap-its＠openldap.org

11 Nov 11 Nov

8:37 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

Quanah Gibson-Mount quanah@openldap.org changed:

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

13 Jul 13 Jul

12:55 p.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #1 from libor.peltan@nic.cz --- BUMP, could you please look at this issue? It still hurts us. Thank you.

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

10 Jul 10 Jul

10:17 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #2 from jtgrassie jtg@xtrabass.com --- I wonder if this "try harder" approach is best activated via a flag on mdb_env_open? That way, only users who want the more aggressive search for free space get impacted.

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

26 Jul 26 Jul

7:59 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #3 from libor.peltan@nic.cz --- (In reply to jtgrassie from comment #2)

...

I wonder if this "try harder" approach is best activated via a flag on mdb_env_open? That way, only users who want the more aggressive search for free space get impacted.

I thought about it as well, but then I count'd find any use-case for the other (current) behavior. When the database can still grow, there is a clear trade-off between performance and occupied storage. But otherwise, should there be a trade-off between performance and (non-)functionality? Could you please think of an example where is it more desirable to return a failure than to continue trying hard (to find some free space)? Thank you!

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

11:58 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #4 from jtgrassie jtg@xtrabass.com ---

...

I thought about it as well, but then I count'd find any use-case for the other (current) behavior. When the database can still grow, there is a clear trade-off between performance and occupied storage. But otherwise, should there be a trade-off between performance and (non-)functionality? Could you please think of an example where is it more desirable to return a failure than to continue trying hard (to find some free space)? Thank you!

The trade-off isn't really between performance and (non-)functionality, rather about performance vs growth rate concerns. This is because you either do a quick freelist search, and if not found, you can grow the environment (which is fast; the current solve), vs doing an aggressive (slow) fuller search (this proposed patch), before potentially growing.

Hence if you don't really care about growth rate or are happy enough to occasionally do some maintenance when you do care (i.e. performing an `mdb_copy -c ...` to compact out the freelist), then the current approach is fine. But if you do care more about growth rate than performance, something like this patch is useful, because you're saying "hey, I don't care you might take a while looking through the freelist for free pages, I'd prefer that to growing unnecessarily".

This is why I think this is a good candidate for an env open flag, so the user can decide which matters more to them.

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

27 Jul 27 Jul

12:37 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #5 from libor.peltan@nic.cz --- @4jtgrassie

This is a misunderstanding.

My proposed patch only performs an aggressive (slow) fuller search if the database can NOT grow, since it had already grown to its limit (mapsize).

Hence, it's really a trade-off between performance and functionality: with the patch, the database tries hard to find any available free space; without the patch (or with it deactivated by some env flag), it tries a little and than fails with MDB_MAP_FULL. Can this be desirable in some use-case?

When the database is not full and the environment has the possibility to grow, my patch does not change anything. The performance/size trade-off coefficient remains unchanged. (I believe it has been carefully chosen based on experience, and it's possible that some users might want to configure it differently, but I'd like to avoid this topic completely. It would be too arguable.)

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

4:19 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #6 from jtgrassie jtg@xtrabass.com ---

...

This is a misunderstanding.

My proposed patch only performs an aggressive (slow) fuller search if the database can NOT grow, since it had already grown to its limit (mapsize).

No, I do understand this. The issue is all about determining *if* there is any free space (if the environment really is full or not).

...

Hence, it's really a trade-off between performance and functionality: with the patch, the database tries hard to find any available free space; without the patch (or with it deactivated by some env flag), it tries a little and than fails with MDB_MAP_FULL. Can this be desirable in some use-case?

It's not a "functionality" trade-off because as I mentioned, one can resize (grow) the environment, which is fast (`mdb_env_set_mapsize`). Hence the trade-off is performance vs growth rate concerns.

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

29 Jul 29 Jul

1:34 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #7 from libor.peltan@nic.cz --- @jtgrassie Well, for the sake of clarity, let's distinguish:

- growth(1): increasing the LMDB data file size within the limit of pre-configured (and static) Mapsize - growth(2): increasing the Mapsize

Do you suggest that when the free space in given env is not found (e.g. due to free space fragmentation), it's desirable to grow(2) ? I guess the code of LMDB internally never does this, but it's of course possible to implement in the application some logic like:

- try to commit a RW txn - if MDB_MAP_FULL is encountered, grow(2) and re-try

...but there would always need to be _some_ limit (e.g. available disk space). Could you discuss why wouldn't one configure the Mapsize to _that_ very maximal limit already at the beginning and rely only on growth(1) which is much less complicated?

From my point of view, the situation looks quite simple:

- set the Mapsize to the highest possible limit - the DB will probably never reach it, enjoying the performance-size-trade-off when searching for available free chunk - when the fragmentation goes so bad that it actually reaches the limit, it's always better to sacrifice the performance than failing to write data (with no possibility of any growth)

But if you could sketch different reasonable operation modes, i'd be glad :)

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

8:57 p.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #8 from jtgrassie jtg@xtrabass.com ---

...

growth(1): increasing the LMDB data file size within the limit of

pre-configured (and static) Mapsize

This is not part of the discussion, LMDB already does this. The question is about how hard to look for free space once you've grown to the bounds.

...

growth(2): increasing the Mapsize

This is an option when you get a MDB_MAP_FULL from a write tx. As mentioned, using `mdb_env_set_mapsize`.

...

Do you suggest that when the free space in given env is not found (e.g. due to free space fragmentation), it's desirable to grow(2) ? I guess the code of LMDB internally never does this, but it's of course possible to implement in the application some logic like:

try to commit a RW txn

if MDB_MAP_FULL is encountered, grow(2) and re-try

That's exactly the pattern I described, and is used by various applications.

...

Could you discuss why wouldn't one configure the Mapsize to _that_ very maximal limit already at the beginning

There are many reasons why one wouldn't set the initial size to that of *all* available disk space.

...

From my point of view, the situation looks quite simple:

set the Mapsize to the highest possible limit

the DB will probably never reach it, enjoying the performance-size-trade-off

when searching for available free chunk

when the fragmentation goes so bad that it actually reaches the limit, it's

always better to sacrifice the performance than failing to write data (with no possibility of any growth)

The database can be compacted *outside* of the main application (as I mentioned, performing an `mdb_copy -c ...`) which can be done as part of a maintenance window for example, therefore never sacrificing the performance.

It all comes down to your specific application requirements, hardware and use-case. Which is exactly *why* I suggested your patch be enabled via a flag (an option) on `mdb_env_open`. I can appreciate some may prefer a "try harder" approach to look for free space (accepting potentially slow writes as a consequence), but I think you also need to appreciate others may not want this (and thus solve fragmentation differently).

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

21 Sep 21 Sep

8:40 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #9 from libor.peltan@nic.cz --- OK, I admit that there might be applications that routinely increase mapsize whenever MDB_MAP_FULL is throwed and they are happy with it.

For us, it would be nice if we could achieve our desired behavior (i.e. LMDB tries hard to find free space if mapsize had been reached) anyhow, say by switching any magic flag.

How can I help LMDB authors/maintainers to implement this? Shall I upload a modified patch, or leave it on their (perhaps more polished) coding style?

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

9:06 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

Howard Chu hyc@openldap.org changed:

--- Comment #10 from Howard Chu hyc@openldap.org --- Not particularly interested in this approach because with freelist fragmentation, it will frequently be futile. The current code was arrived at after hundreds of hours of testing, we know from experience that minor tweaks like this won't help much.

The way forward is to optimize the in-memory handling of the freelist, the way BoltDB did here https://github.com/etcd-io/bbolt/pull/138

Patches to port their approach into LMDB are welcome. Their patch is simple because they use built in hashtable primitives in Go. A bit more work is needed to do the same here. I recommend using http://www.tommyds.it/ as a starting point.

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

10:20 a.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

--- Comment #11 from Howard Chu hyc@openldap.org --- Also worth reading http://www.nothings.org/computer/judy/ but that's the article that led me to TommyDS.

-- You are receiving this mail because: You are on the CC list for the issue.

openldap-its＠openldap.org

30 Sep 30 Sep

11:28 p.m.

New subject: [Issue 9735] [PATCH] try hard to find free space if database cannot grow

https://bugs.openldap.org/show_bug.cgi?id=9735

Howard Chu hyc@openldap.org changed:

-- You are receiving this mail because: You are on the CC list for the issue.

650

Age (days ago)

1347

Last active (days ago)

openldap-bugs@openldap.org

13 comments

1 participants

tags (0)

participants (1)

openldap-its＠openldap.org