The documentation for the `mdb_cursor_del` function doesn't indicate what happens to the state of the cursor after the targeted object is deleted.
This makes it difficult to reason about, for example, the task of scanning through the database and deleting keys or values according to some criterion.
What does MDB_NEXT mean on a cursor after deletion? Will it return the next element, or should I use MDB_GET_CURRENT for the next element? Or will these operations return in error, and I should use MDB_SET_RANGE with the deleted key?
Similarly, what happens to every other cursor pointing to the same key? I see you have some code that touches them, but no documentation to tell the user's what is happening.
This needs to be documented. Iterator invalidation is painful in C++ where it is thoroughly documented. When it's reduced to guesswork or testing without a clear indication of intended behavior, it's worse.
David Barbour wrote:
The documentation for the `mdb_cursor_del` function doesn't indicate what happens to the state of the cursor after the targeted object is deleted.
This makes it difficult to reason about, for example, the task of scanning through the database and deleting keys or values according to some criterion.
What does MDB_NEXT mean on a cursor after deletion? Will it return the next element, or should I use MDB_GET_CURRENT for the next element? Or will these operations return in error, and I should use MDB_SET_RANGE with the deleted key?
Similarly, what happens to every other cursor pointing to the same key? I see you have some code that touches them, but no documentation to tell the user's what is happening.
This needs to be documented. Iterator invalidation is painful in C++ where it is thoroughly documented. When it's reduced to guesswork or testing without a clear indication of intended behavior, it's worse.
The behavior wasn't settled, in the beginning. But you can read the mtest source code to see what the expected behavior is.
In an open source project, you are expected to read source code to understand what's going on.
Things have settled down enough now to formally document it though.
The point of using cursors is to retain state about a position in the DB, so it can be reused for a subsequent operation. This is the guiding principle. Otherwise cursors would have no reason to exist.
After a cursor_del, the cursor's position is unchanged - it points to the same page and node slot as it did before. But, since the node it pointed to was deleted, the actual data it points to changes. I.e., all nodes in the page are moved down 1 slot to occupy the vacated slot, so it points to whatever was the next item.
Both MDB_NEXT and MDB_GET_CURRENT will return the same record after cursor_del. This simplifies loops where you just want to delete all the records in a range - just use MDB_NEXT like any other loop iteration.
The cursor's actual position can change if the deletion causes a page merge. But its relative position in the set of data will still be the same.
On Wed, Feb 4, 2015 at 7:13 PM, Howard Chu hyc@symas.com wrote:
The behavior wasn't settled, in the beginning. But you can read the mtest source code to see what the expected behavior is.
In an open source project, you are expected to read source code to understand what's going on.
I have been.
Though, in this particular case, I needed to dig all the way down to LMDB page layout to grok key indexes, compare to mdb_cursor_next, search for instances of C_DEL, etc. to understand what was happening.
Reading code works better when we have something more modular and compositional, i.e. such that you can read and understand in small pieces. LMDB is just one big module, you can't understand most pieces without seeing the whole picture. This isn't a bad thing for its purpose, but it does mean good API documentation saves hours or days of reading code for someone who wants to just use LMDB.
Things have settled down enough now to formally document it though.
Then make it so. It will save a lot of people a lot of time. :)
Regards,
Dave
David Barbour wrote:
On Wed, Feb 4, 2015 at 7:13 PM, Howard Chu <hyc@symas.com mailto:hyc@symas.com> wrote:
The behavior wasn't settled, in the beginning. But you can read the mtest source code to see what the expected behavior is. In an open source project, you are expected to read source code to understand what's going on.
I have been.
Though, in this particular case, I needed to dig all the way down to LMDB page layout to grok key indexes, compare to mdb_cursor_next, search for instances of C_DEL, etc. to understand what was happening.
You just needed to read mtest.c to understand the intended behavior.
Things have settled down enough now to formally document it though.
Then make it so. It will save a lot of people a lot of time. :)
Suggest the language for such a change. Patch welcome.