https://bugs.openldap.org/show_bug.cgi?id=10346
Issue ID: 10346
Summary: mdb_env_copy2 on a database with a value larger than (2GB-16) results in a corrupt copy
Product: LMDB
Version: 0.9.31
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs@openldap.org
Reporter: mike.moritz@vertex.link
Target Milestone: ---
Created attachment 1072
  --> https://bugs.openldap.org/attachment.cgi?id=1072&action=edit
reproduction source code
Running mdb_env_copy2 with compaction on a database with a value larger than (2GB-16) bytes appears to complete successfully, in that no errors are reported, but reading the value from the copied database fails with an MDB_CORRUPTED error. Judging by the size of the copied database, the value is either being skipped or significantly truncated. Running mdb_env_copy2 without compaction also completes successfully, and the copied database can be opened and read.
I initially encountered this while using py-lmdb with v0.9.31 of LMDB, but was able to write up a simple script that uses the library directly. The source for the script is attached, and the results below are from running it with the latest from master.
Without compaction:

$ ./lmdb_repro test.lmdb $((2 * 1024 * 1024 * 1024 - 16 + 1)) testbak.lmdb
LMDB Version: LMDB 0.9.70: (December 19, 2015)
Set LMDB map size to 21474836330 bytes
Successfully inserted key with 2147483633 bytes of zero-filled data
Retrieved 2147483633 bytes of data
First 16 bytes (hex): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...

Copying database to testbak.lmdb...
Database copy completed successfully.

Opening copied database and reading value...
Retrieved 2147483633 bytes of data from copied database
First 16 bytes from copy (hex): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
Data size matches between original and copy
With compaction:

$ ./lmdb_repro -c test.lmdb $((2 * 1024 * 1024 * 1024 - 16 + 1)) testbak.lmdb
LMDB Version: LMDB 0.9.70: (December 19, 2015)
Set LMDB map size to 21474836330 bytes
Successfully inserted key with 2147483633 bytes of zero-filled data
Retrieved 2147483633 bytes of data
First 16 bytes (hex): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...

Copying database to testbak.lmdb (with compaction)...
Database copy completed successfully.

Opening copied database and reading value...
mdb_get (copy) failed: MDB_CORRUPTED: Located page was wrong type
Size difference on corrupt DB:

$ du -sh ./*
312K    ./lmdb_repro
24K     ./testbak.lmdb
2.1G    ./test.lmdb
With compaction at the perceived max size:

$ ./lmdb_repro -c test.lmdb $((2 * 1024 * 1024 * 1024 - 16)) testbak.lmdb
LMDB Version: LMDB 0.9.70: (December 19, 2015)
Set LMDB map size to 21474836320 bytes
Successfully inserted key with 2147483632 bytes of zero-filled data
Retrieved 2147483632 bytes of data
First 16 bytes (hex): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...

Copying database to testbak.lmdb (with compaction)...
Database copy completed successfully.

Opening copied database and reading value...
Retrieved 2147483632 bytes of data from copied database
First 16 bytes from copy (hex): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
Data size matches between original and copy
--- Comment #1 from mike.moritz@vertex.link ---
Created attachment 1073
  --> https://bugs.openldap.org/attachment.cgi?id=1073&action=edit
naive patch possibly identifying the problem ints
I am not familiar enough with the code to suggest a robust fix, but this patch seems to identify the problem ints.
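As a rough illustration of why the threshold might sit at exactly (2GB-16), here is a small arithmetic sketch. It is not taken from the LMDB source or from the attached patch. It assumes 4096-byte pages, a 16-byte page header in front of the value, and that the copy code keeps the byte count for the pages after the first one in a signed 32-bit int. The names ASSUMED_PAGE_SIZE, ASSUMED_PAGE_HEADER, overflow_pages, trailing_bytes and fits_in_int32 are made up for this example.

```c
#include <stdint.h>

/* Assumptions for illustration only (not from the report or the patch):
 * 4096-byte pages and a 16-byte page header preceding the value. */
#define ASSUMED_PAGE_SIZE   4096u
#define ASSUMED_PAGE_HEADER 16u

/* Number of contiguous pages needed to hold the page header plus a
 * value of value_len bytes (rounded up to whole pages). */
static uint64_t overflow_pages(uint64_t value_len)
{
    uint64_t total = value_len + ASSUMED_PAGE_HEADER;
    return (total + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}

/* Byte count of every page after the first one, computed in 64 bits
 * so that the true value is visible without wrapping. */
static uint64_t trailing_bytes(uint64_t value_len)
{
    return (overflow_pages(value_len) - 1) * (uint64_t)ASSUMED_PAGE_SIZE;
}

/* Returns 1 if that byte count can be stored in a signed 32-bit int,
 * 0 if it would not fit. */
static int fits_in_int32(uint64_t value_len)
{
    return trailing_bytes(value_len) <= (uint64_t)INT32_MAX;
}
```

Under these assumptions, a value of 2147483632 bytes needs 524288 pages and the trailing byte count is 2147479552, which still fits. One more byte pushes the run to 524289 pages and the trailing byte count to 2147483648, which is one past INT32_MAX. That would be consistent with the copy succeeding at (2GB-16) and failing at (2GB-16+1), but I have not confirmed that this is the actual mechanism.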