https://bugs.openldap.org/show_bug.cgi?id=9920
Issue ID: 9920
Summary: MDB_PAGE_FULL with master3 (encryption) because there is no room for the authentication data (MAC)
Product: LMDB
Version: unspecified
Hardware: x86_64
OS: Mac OS
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: info@parlepeuple.fr
Target Milestone: ---
Created attachment 915 --> https://bugs.openldap.org/attachment.cgi?id=915&action=edit proposed patch
Hello,
On master3, using the encryption-at-rest feature, I am testing as follows:
- On a new named database, I set the encryption function with mdb_env_set_encrypt(env, encfunc, &enckey, 32).
- Note that I chose 32 bytes for the size parameter ("The size of authentication data in bytes, if any. Set this to zero for unauthenticated encryption mechanisms.").
- I add 2 entries to the DB, trying to saturate the first page. I chose a key of 33 bytes and a value of 1977 bytes, so the size of each node is 2010 bytes (the 2 keys are of course different).
- This passes, and the DB has just one leaf_pages, no overflow_pages, no branch_pages, and a depth of 1.
- If I add one byte to the values I insert (starting again from a blank DB), then instead of seeing 2 overflow_pages, I get an error: MDB_PAGE_FULL.
- This clearly should not have happened.
- Here is some tracing:

    add to leaf page 2 index 0, data size 48 key size 7 [74657374646200]
    add to leaf page 3 index 0, data size 1978 key size 33 [000000000000000000000000000000000000000000000000000000000000000000]
    add to branch page 5 index 0, data size 0 key size 0 [null]
    add to branch page 5 index 1, data size 0 key size 33 [000000000000000000000000000000000000000000000000000000000000000000]
    add to leaf page 4 index 0, data size 1978 key size 33 [000000000000000000000000000000000000000000000000000000000000000000]
    add to leaf page 4 index 1, data size 1978 key size 33 [020202020202020202020202020202020202020202020202020202020202020202]
    not enough room in page 4, got 1 ptrs
    upper-lower = 2020 - 2 = 2016
    node size = 2020
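The numbers in this trace can be reproduced with a small standalone model of the leaf-page space check. This is only a sketch: it does not call LMDB, and every constant in it (4096-byte page, 24-byte page header, 8-byte node header, 2-byte pointer slot) is an assumption picked because it matches the trace, not a value taken from the library sources. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout constants, chosen to match the trace in the report. */
static const size_t kPageSize = 4096; /* assumed page size */
static const size_t kPageHdr  = 24;   /* assumed page header size */
static const size_t kNodeHdr  = 8;    /* assumed per-node header size */
static const size_t kPtrSize  = 2;    /* assumed size of one pointer slot */

/* Round up to an even number of bytes. */
static size_t even(size_t n) { return (n + 1) & ~(size_t)1; }

/* Bytes an inline leaf node occupies, excluding its pointer slot. */
static size_t leaf_node_size(size_t key_len, size_t data_len)
{
    return even(kNodeHdr + key_len + data_len);
}

/* Returns 1 if `count` equally sized nodes fit in one leaf page that
 * also reserves `mac_len` bytes for the authentication tag, else 0. */
static int nodes_fit(size_t count, size_t key_len, size_t data_len,
                     size_t mac_len)
{
    assert(mac_len < kPageSize - kPageHdr);
    size_t lower = 0;                              /* pointer array grows up */
    size_t upper = kPageSize - kPageHdr - mac_len; /* node bodies grow down */
    size_t node = leaf_node_size(key_len, data_len);

    for (size_t i = 0; i < count; i++) {
        size_t gap = upper - lower;
        /* The new node needs its body plus one pointer slot. */
        if (gap < kPtrSize || node > gap - kPtrSize)
            return 0;
        upper -= node;
        lower += kPtrSize;
    }
    return 1;
}
```

Under these assumptions, a 33-byte key with a 1978-byte value gives a node size of 2020, and the second insert sees 2020 - 2 - 2 = 2016 bytes of room, which is the failure shown in the trace. With a 1977-byte value the node size is 2018 and both nodes fit exactly.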
Looking at the code, I understand that there is a problem at line 9005:

    } else if (node_size + data->mv_size > mc->mc_txn->mt_env->me_nodemax) {

Here me_nodemax is incorrect, because it does not take into account that some bytes will be needed for the MAC (authentication code), whose size is in env->me_esumsize.
me_nodemax is calculated at line 5349:

    env->me_nodemax = (((env->me_psize - PAGEHDRSZ ) / MDB_MINKEYS) & -2) - sizeof(indx_t);

So I subtract me_esumsize by adding "- env->me_esumsize" here:

    env->me_nodemax = (((env->me_psize - PAGEHDRSZ - env->me_esumsize) / MDB_MINKEYS) & -2) - sizeof(indx_t);
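To make the effect of this change concrete, here is a minimal standalone sketch of the two formulas. It does not use LMDB itself. The values plugged in (4096-byte page, 24-byte page header, MDB_MINKEYS of 2, 2-byte indx_t, 8-byte node header) are assumptions that happen to be consistent with the trace in this report, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Largest inline node size, following the formula quoted in the report.
 * Pass mac_len = 0 for the original formula, or the MAC size for the
 * patched one. */
static size_t node_max(size_t psize, size_t hdr, size_t mac_len)
{
    assert(psize > hdr + mac_len);
    const size_t min_keys = 2; /* assumed value of MDB_MINKEYS */
    const size_t ptr_size = 2; /* assumed sizeof(indx_t) */
    /* "& -2" in the original clears the low bit, same as "& ~1". */
    return (((psize - hdr - mac_len) / min_keys) & ~(size_t)1) - ptr_size;
}

/* Mirrors the shape of the check at line 9005: returns 1 if the value
 * would be moved to an overflow page rather than stored inline. */
static int goes_to_overflow(size_t key_len, size_t data_len, size_t nodemax)
{
    const size_t node_hdr = 8; /* assumed per-node header size */
    return node_hdr + key_len + data_len > nodemax;
}
```

With these assumed values, `node_max(4096, 24, 0)` is 2034 and `node_max(4096, 24, 32)` is 2018. A 33-byte key with a 1978-byte value needs 2019 bytes, so it stays inline under the original limit (and then cannot fit next to a second such node plus the MAC), while under the adjusted limit it is sent to an overflow page.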
I also subtract it from me_maxfree_1pg in the line above, and from pmax at line 10435.
I do not know if my patch is correct, but it solves the issue. Maybe there are other places in the code where me_esumsize should be subtracted from the available size. For example, the calculation of the number of overflow pages in OVPAGES does not take me_esumsize into account, but I think that is OK, because there is only one MAC for the entire set of OV pages, and there is room for it in the first OV page.
See the attached proposed patch.
--- Comment #1 from NikoPLP info@parlepeuple.fr ---
Hello again,
Some more input about this.
I am running into some really nasty errors when using the authenticated encryption feature of master3.
What I have noticed now is that some entries in the DB are corrupted when using authentication (MAC tag).
At the end of some entries, I find trailing zeros instead of the data I added.
In fact, the number of trailing zeros is equal to the size of the MAC minus 1 or 2 (depending on whether the value is a bigdata or not).
I was not able to trace down the problem exactly.
But I can confirm that the data on disk is not corrupted. The problem occurs at the moment of the read (mdb_get).
The buffer that is passed to the encryption function already contains the zeros. It is not a problem with my encryption function (which is just doing a memcpy for now, to facilitate the debugging)
The size of the buffers is correct, and there is no segfault.
But I find those zeros at the end of the value.
Disabling the MAC authentication solves the problem.
I could not find at which point those zeros are added to the end of the buffer.
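One way to narrow down a symptom like this is to use a pass-through transform and measure how much of each value's tail has been zeroed on read. The sketch below is standalone and hypothetical: `buf_t`, `passthrough`, and `zeroed_tail` are illustrative names, not LMDB types or callbacks, and the real encryption callback in master3 has its own signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer descriptor (a stand-in, not LMDB's own type). */
typedef struct {
    size_t size;
    void  *data;
} buf_t;

/* Pass-through "cipher": copies src into dst unchanged.
 * Returns 0 on success, -1 if dst is too small. */
static int passthrough(const buf_t *src, buf_t *dst)
{
    assert(src != NULL && dst != NULL);
    if (dst->size < src->size)
        return -1;
    memcpy(dst->data, src->data, src->size);
    return 0;
}

/* Counts how many bytes at the end of `got` are zero where `want`
 * holds a non-zero byte. Writing values made only of non-zero bytes
 * makes the result unambiguous. */
static size_t zeroed_tail(const unsigned char *want,
                          const unsigned char *got, size_t len)
{
    size_t n = 0;
    while (n < len && got[len - 1 - n] == 0 && want[len - 1 - n] != 0)
        n++;
    return n;
}
```

Comparing the result of `zeroed_tail` against the configured MAC size for inline and bigdata values would confirm or refute the "MAC size minus 1 or 2" pattern described above.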
Your help and some pointers on how to solve that would be greatly appreciated. Thanks !
--- Comment #2 from Howard Chu hyc@openldap.org ---
Please provide minimal sample code demonstrating the problem.
--- Comment #3 from NikoPLP info@parlepeuple.fr ---
Hello,
Could you please already review the first part of the issue, which I posted on the 24th? It has pseudocode, and I would like to get your input on it.
For the second part, I will try to extract some code. But we are already deep into our application code. Many calls are made to the LMDB library before the issue arises, so I can reproduce it easily while testing my app, but I don't have a set of LMDB API calls in one file to give you. That is much more work for me.
I look forward to your feedback on the descriptions I have already posted.
Thanks
--- Comment #4 from Howard Chu hyc@openldap.org ---
For the first part, I've created a different patch from yours:
https://git.openldap.org/openldap/openldap/-/merge_requests/567
Will try to duplicate your results later.
Quanah Gibson-Mount quanah@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs_review                |
--- Comment #5 from Howard Chu hyc@openldap.org ---
(In reply to NikoPLP from comment #3)
> Hello,
> Could you please already review the first part of the issue, which I posted on the 24th? It has pseudocode, and I would like to get your input on it.
> For the second part, I will try to extract some code. But we are already deep into our application code. Many calls are made to the LMDB library before the issue arises, so I can reproduce it easily while testing my app, but I don't have a set of LMDB API calls in one file to give you. That is much more work for me.
> I look forward to your feedback on the descriptions I have already posted.
> Thanks
I'm unable to reproduce the data corruption you described.
The pagesize patch has been committed as b9db2582cb31aa0ec88371db388095cc31ceb2f4
--- Comment #6 from NikoPLP info@parlepeuple.fr ---
Thanks, Howard, for the page size fix. I tried the merge request back then, and it fixed the MDB_PAGE_FULL issue, but it had no impact on the other issue of corrupted data.
About that corruption of values during read operations: I will soon send you a C file with a reproducible test case. Sorry I couldn't do it sooner; I was very busy with other things (and I temporarily deactivated authenticated data so I could continue working). I will come back to you within a week or two. Thanks again.
Howard Chu hyc@openldap.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |IN_PROGRESS
--- Comment #7 from Howard Chu hyc@openldap.org ---
Have you got a reproducer test case for us?