--000000000000d405310575a6de5e Content-Type: text/plain; charset="UTF-8"
Hi Howard and thanks for the info.
I've read your analysis of the corruption, and I'd like to write some sort of integrity check to run before I try to open the file. Could you point me to where the file format is described: where can I find the meta-pages, and at which offset do the used pages start? Basically, I want to read the latest meta-page and check whether its page count corresponds with the actual file size.
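A minimal sketch of such a check follows. It rests on assumptions that are not confirmed anywhere in this thread: the byte offsets are read off the MDB_page/MDB_meta structs in mdb.c for LMDB 0.9.x on a 64-bit little-endian build (16-byte page header, meta pages at pages 0 and 1), and this layout is internal to LMDB rather than a documented format. The names parse_meta and check_file are hypothetical.

```python
import os
import struct

MDB_MAGIC = 0xBEEFC0DE   # magic number stored at the start of MDB_meta
PAGE_HDR = 16            # ASSUMED page header size on 64-bit builds
# ASSUMED byte offsets inside MDB_meta (64-bit, little-endian):
OFF_MAGIC, OFF_PSIZE, OFF_LAST_PG, OFF_TXNID = 0, 24, 120, 128
META_END = PAGE_HDR + OFF_TXNID + 8   # bytes needed to read one meta page


def parse_meta(buf):
    """Return (psize, last_pg, txnid), or None if the magic does not match."""
    if len(buf) < META_END:
        return None
    magic, = struct.unpack_from("<I", buf, PAGE_HDR + OFF_MAGIC)
    if magic != MDB_MAGIC:
        return None
    psize, = struct.unpack_from("<I", buf, PAGE_HDR + OFF_PSIZE)
    last_pg, = struct.unpack_from("<Q", buf, PAGE_HDR + OFF_LAST_PG)
    txnid, = struct.unpack_from("<Q", buf, PAGE_HDR + OFF_TXNID)
    return psize, last_pg, txnid


def check_file(path):
    """Compare the page count claimed by the newest meta page with the file size."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        first = parse_meta(f.read(META_END))
        if first is None:
            return False, "meta page 0 is unreadable"
        f.seek(first[0])              # second meta page starts at offset psize
        second = parse_meta(f.read(META_END))
    metas = [m for m in (first, second) if m is not None]
    # The meta page with the higher transaction id is the latest one.
    psize, last_pg, txnid = max(metas, key=lambda m: m[2])
    needed = (last_pg + 1) * psize    # pages are numbered from 0
    ok = size >= needed
    return ok, f"txn {txnid}: needs {last_pg + 1} pages ({needed} bytes), file has {size} bytes"
```

If the layout assumptions hold, a file that is 6 pages long while its latest meta page claims 11 would be reported as failing the check. Where opening the environment first is acceptable, the public mdb_env_info() call also reports me_last_pgno and avoids depending on internal offsets.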
Second, as I've stated, my target operating system is macOS. Are you aware of any known issues with file atomicity on this particular OS? Perhaps you could also tell me how LMDB maintains data consistency under the hood when the database is opened with MDB_NOMETASYNC, so that I can investigate this issue further and report it to Apple if necessary.
Irad
On Mon, Sep 10, 2018 at 8:44 PM Howard Chu <hyc@symas.com> wrote:
iradization@gmail.com wrote:
Full_Name: Irad.k
Version: recent from GitHub.
OS: macOS
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (82.81.84.130)
Hi,
I'm working with LMDB with the following flags: MDB_NOMETASYNC | MDB_NOSUBDIR | MDB_NOTLS.
Somehow, even though the DB should be ACI (without the D), it got corrupted after recovering from a kernel panic, and it crashes my process when trying to access one of the records (see crash log below).
Here's a link to the file :
https://drive.google.com/file/d/12Q3KYYrapiJOgiaccnDL3tQACkqrM3C5/view?usp=s...
This file is only 6 pages long, but the latest meta page claims that it's using 11 pages. Looking at the previous meta page, it claims that 9 pages were used.
Clearly your OS failed to write the complete contents out before it panicked. At the very least the file should have been 9 pages long. Sounds like you have an OS bug.
According to the crash log from the process, it can clearly be seen that the invalid address resides inside the mapped file region, which is the LMDB mapped file, but I still get KERN_MEMORY_ERROR on that address.
From what I know, an attempt to access an address within the mapped range can either retrieve the page contents directly from memory (if they are already there), or trigger a page fault trap that eventually leads to reading the missing data from disk and returning it to the process as well.
One thing that raises some concerns is that the file size is only 24k and the mapping spans over 256M. However, the file's metadata seems to be coherent with the file contents.
Any idea how it happened, and what exactly in the file causes this corruption?
CRASH LOG :
Exception Type:        EXC_BAD_ACCESS (SIGBUS)
Exception Codes:       KERN_MEMORY_ERROR at 0x000000010648800a
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Bus error: 10
Termination Reason:    Namespace SIGNAL, Code 0xa
Terminating Process:   exc handler [0]

VM Regions Near 0x10648800a:

    __LINKEDIT             0000000106464000-000000010647f000 [  108K] r--/rwx SM=COW  [/usr/lib/dyld]
--> mapped file            000000010647f000-000000011647f000 [256.0M] r--/rwx SM=PRV  Object_id=9034edd9
    STACK GUARD            000070000f2b3000-000070000f2b4000 [    4K] ---/rwx SM=NUL  stack guard for thread 1

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   myprog   0x0000000101666756 mdb_page_search_root + 39
1   myprog   0x00000001016660f7 mdb_page_search + 182
2   myprog   0x00000001016614de mdb_cursor_set + 88
3   myprog   0x0000000101661476 mdb_get + 134
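
The crash address is consistent with the page counts quoted above. The following is a small worked calculation, not part of the original report, under two assumptions: a 4096-byte page size (inferred from a 24k file being 6 pages long) and a mapping that starts at file offset 0.

```python
PAGE_SIZE = 4096                      # ASSUMPTION: 24k file / 6 pages

file_pages = 24 * 1024 // PAGE_SIZE   # pages actually present on disk
map_start = 0x000000010647F000        # start of the "mapped file" region in the log
fault_addr = 0x000000010648800A       # address reported with KERN_MEMORY_ERROR

# ASSUMPTION: the mapping begins at file offset 0, so this is a file offset.
offset = fault_addr - map_start
fault_page, in_page = divmod(offset, PAGE_SIZE)

print(f"file holds pages 0..{file_pages - 1}")
print(f"fault at file offset {offset:#x} = page {fault_page}, byte {in_page}")
print("beyond end of file" if fault_page >= file_pages else "inside the file")
```

Under these assumptions the fault falls in page 9, which lies inside the 11 pages claimed by the latest meta page but past the 6 pages that exist on disk. Touching a mapped page that lies wholly beyond the end of the underlying file raises SIGBUS, which matches the exception in the log even though the address is inside the 256M mapping.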
-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
--000000000000d405310575a6de5e--