Full_Name: Gawen Arab
Version: 0.9.14
OS: OSX
URL: http://gawen.me/pub/lmdb_crash/gawen-150223.tar.xz
Submission from: (NULL) (77.151.59.7)
Hello,
I'm currently developing a virtual filesystem whose metadata is stored in an LMDB 0.9.14 database. This software currently runs on Android, iOS, Linux, OSX and Windows.
It is currently used by 50 users, and I have received 11 crash reports of a SIGSEGV raised in LMDB. All reports come from OSX computers.
mdb.c:5382: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
This comes from this assertion https://gitorious.org/mdb/mdb/source/2f587ae081d076e3707360c5db086520c219d3e... .
This happens when the software iterates over the keys in the database the same way mdb_dump does, by calling mdb_cursor_get(..., MDB_NEXT) in a loop.
I managed to get the LMDB db folder back from one user who runs 64-bit OSX. I ran mdb_dump on it on my 64-bit Linux machine, and mdb_dump crashes the same way. I also tried the latest mdb_dump from the mdb.master branch (commit 3368d1f5e243225cba4d730fba19ff600798ebe3), but I get the same assertion failure:
mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
I have attached the faulty database 'db/' to this issue.
The output of mdb_dump on my machine is attached as the file 'mdb_dump.log'. I then reran mdb_dump with MDB_DEBUG defined to 2 and mdb_debug set to true, and attached the output of this new dump as the file 'mdb_dump.debug.log'. The interesting part:
mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 48
 b5f1ff0700000000 0700000000000000bc10000000000000000e000000000000860d000000000000e308000000000000ee07000000000000a5070000000000000307000000000000
mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 49
 b6f1ff0700000000 0700000000000000820e000000000000310e000000000000770d000000000000660c000000000000cb0a0000000000002a080000000000000008000000000000
mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
mdb_cursor_next:5579 =====> move to next sibling page
mdb_cursor_pop:5095 popped page 1946 off db 1 cursor 0x1897350
mdb_cursor_sibling:5503 parent page is page 4643, index 16
mdb_cursor_sibling:5521 just moving to right index key 17
mdb_cursor_push:5104 pushing page 704 on db 1 cursor 0x1897350
mdb_cursor_next:5585 next page is 704, key index 0
mdb_cursor_next:5591 ==> cursor points to page 704 with 7 keys, key index 0
mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
%%0

I do not know much about the internal design of LMDB for now, so I'm struggling to understand the debug lines...
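As far as I can tell, the trace says that after the last key of leaf page 1946 the cursor pops back to parent page 4643, moves one slot to the right, and lands on page 704, which is not flagged as a leaf. Here is my tentative reading written as a toy model in C. This is only a sketch: the toy_* types and functions are hypothetical and far simpler than LMDB's real structures, the parent has just two slots instead of the real indices 16 and 17, and I do not know what kind of page 704 actually is.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not LMDB's real page layout. */
enum toy_kind { TOY_BRANCH, TOY_LEAF, TOY_OTHER };

struct toy_page {
    unsigned pgno;                   /* page number as printed in the trace */
    enum toy_kind kind;              /* what the page claims to be */
    size_t nkeys;                    /* number of keys (slots) on the page */
    const struct toy_page *child[2]; /* child pages, used by branch pages only */
};

/* The three pages named in the trace. Page 704's real type is unknown to me;
 * all the assertion tells us is that it is not a leaf. */
const struct toy_page toy_leaf_1946   = { 1946, TOY_LEAF,  50, { NULL, NULL } };
const struct toy_page toy_page_704    = {  704, TOY_OTHER,  7, { NULL, NULL } };
const struct toy_page toy_parent_4643 = { 4643, TOY_BRANCH, 2,
                                          { &toy_leaf_1946, &toy_page_704 } };

/* "just moving to right index": advance to the next slot of the parent and
 * return the page it points to, or NULL if the parent has no further slot. */
const struct toy_page *toy_sibling(const struct toy_page *parent, size_t *idx)
{
    if (*idx + 1 >= parent->nkeys)
        return NULL;
    *idx += 1;
    return parent->child[*idx];
}

/* Returns 1 if the sibling reached is a leaf, 0 otherwise. A 0 here
 * corresponds to the IS_LEAF(mp) assertion failing in the real trace. */
int toy_next_is_leaf(const struct toy_page *parent, size_t *idx)
{
    const struct toy_page *next = toy_sibling(parent, idx);
    return next != NULL && next->kind == TOY_LEAF;
}
```

Starting from slot 0 of toy_parent_4643 (leaf 1946), toy_next_is_leaf() moves to slot 1 and returns 0, which matches the point where mdb_dump aborts. If this reading is right, the parent page references a page that is not a leaf at a level where only leaves are expected.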
Unfortunately, I do not know how to reproduce this bug, but it is a recurring one.
Maybe the following information can help:
- My software opens the database in MDB_NOSYNC mode and performs an mdb_env_sync(env, 1) every second or less. An mdb_env_sync can happen at the same time as any other operation (e.g. mdb_txn_commit). I didn't put any locking mechanism in place. Could this explain such a situation?
- mapsize is currently 512MiB. Previous versions of the software set it to smaller values (the database has not been rebuilt since).
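To rule out the first point on my side, I could serialize the periodic sync and the commits behind one mutex. Below is a minimal sketch of what I have in mind. I do not know whether LMDB actually requires this. The db_guard struct, guarded_commit() and guarded_sync() are hypothetical names of mine, and the fake_* functions are stubs so the sketch compiles without LMDB; in the real software they would be small adapters around mdb_txn_commit(txn) and mdb_env_sync(env, 1).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical wrapper: one lock shared by every code path that commits a
 * transaction or forces a sync. */
struct db_guard {
    pthread_mutex_t lock;
    int (*commit)(void *txn);          /* adapter around mdb_txn_commit() */
    int (*sync)(void *env, int force); /* adapter around mdb_env_sync() */
};

/* Stubs standing in for the LMDB calls; they only count invocations. */
int fake_commit_calls = 0;
int fake_sync_calls = 0;

int fake_commit(void *txn)
{
    (void)txn;
    fake_commit_calls++;
    return 0; /* 0 means success, as with the LMDB calls */
}

int fake_sync(void *env, int force)
{
    (void)env;
    (void)force;
    fake_sync_calls++;
    return 0;
}

/* Commit while holding the lock, so no sync can run at the same time. */
int guarded_commit(struct db_guard *g, void *txn)
{
    pthread_mutex_lock(&g->lock);
    int rc = g->commit(txn);
    pthread_mutex_unlock(&g->lock);
    return rc;
}

/* Forced sync while holding the lock, so no commit can run at the same time. */
int guarded_sync(struct db_guard *g, void *env)
{
    pthread_mutex_lock(&g->lock);
    int rc = g->sync(env, 1);
    pthread_mutex_unlock(&g->lock);
    return rc;
}
```

The committing code would call guarded_commit() and the one-second timer would call guarded_sync(), both on the same db_guard initialized with PTHREAD_MUTEX_INITIALIZER. If the crashes stop with this in place, that would point at the unsynchronized mdb_env_sync; if not, that cause can probably be excluded.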
I'd be happy to provide more information if you need it.
Thank you for your support and this amazing IP.
Regards