From nejasmicz@gmail.com Thu Oct 3 12:48:33 2013
From: nejasmicz@gmail.com
To: openldap-bugs@openldap.org
Subject: Re: (ITS#7715) SIGBUS when mdb is configured with writemap
Date: Thu, 03 Oct 2013 12:48:32 +0000
Message-ID: <201310031248.r93CmWnk033848@boole.openldap.org>

On Wed, Oct 2, 2013 at 6:43 PM, <hyc(a)symas.com> wrote:

> hyc(a)symas.com wrote:
> > Željko Nejašmić wrote:
> >> Here you go http://hastebin.com/fukecejuje.tex
> >
> > Interestingly enough, I got the same result as you on an initial compile/run
> > of slapd. Unfortunately, with optimization, the backtrace wasn't all that
> > useful. Recompiling back-mdb with just -g, no optimization, gets a different
> > result though - slapd is fine, and ldclt dies with a heap corruption or
> > double-free.
>
> And now I cannot reproduce the original SIGBUS at all. But still getting
> various crashes in ldclt.
>
> ldclt -h localhost -p 9011 -D cn=manager,dc=example,dc=com -w secret -b
> ou=people,dc=example,dc=com -e
> object=xx.txt,rdn='uid:[A=INCRNNOLOOP(200000;999999;6)]' -e add,commoncounter
> -I 68
> ldclt version 4.23
> ldclt[30207]: Starting at Wed Oct 2 09:39:10 2013
>
> *** glibc detected *** ldclt: double free or corruption (fasttop):
> 0x00007fc2180021e0 ***
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fc223ff9b96]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(+0x142d1)[0x7fc224f0b2d1]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(+0x1ae74)[0x7fc224f11e74]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(PR_Malloc+0x49)[0x7fc224f0d0b9]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(+0x129f4)[0x7fc224f099f4]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(+0x12068)[0x7fc224f09068]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(PR_vsmprintf+0x38)[0x7fc224f09b28]
> /usr/lib/x86_64-linux-gnu/libnspr4.so(PR_smprintf+0x8c)[0x7fc224f09bec]
> ldclt(+0x663a)[0x7fc22556663a]
> ldclt(+0x74a4)[0x7fc2255674a4]
> ldclt(+0x9d48)[0x7fc225569d48]
> ldclt(threadMain+0x329)[0x7fc2255735d9]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fc224341e9a]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc22406ecbd]
>
> Without any other clues, this feels like ASLR is messing with us but that's
> just a wild guess. I can no longer reproduce the SIGBUS in slapd
> regardless of
> compile options, while ldclt itself keeps dying. If you can find some more
> reliable way to reproduce the issue that would help. Perhaps using the
> client
> in test060.

I ran the whole test suite for mdb, and as far as I can see, every test
returned OK.

I found a new way to reproduce the SIGBUS, using ldapadd on Ubuntu 12.04
against OpenLDAP on RH 6.3:

ldapadd -h 172.17.101.150 -p 389 -D "cn=admin,dc=test" -w test -f test.ldif

The LDIF file is the same as for the previous ldclt command. I doubt it
matters, but the LDIF file contains 1M adds.

On the RH box:
- compiled openldap with -g -O0 and the previously used flags
- started slapd under gdb:

gdb `find /root/openldap/ -type d -printf '-d %p '` --args /opt/openldap/libexec/slapd -h "ldap:/// ldapi:///" -F /opt/openldap/etc/openldap/slapd.d -g openldap -u openldap -d 0

gdb output:
bt -- http://hastebin.com/hefikekaxi.sh
bt 10 full -- http://hastebin.com/vudocosuka.sh

I am beginning to doubt that the problem could be on my end, as the bt seems
to point to schema problems (although I haven't analyzed it in great detail
yet).

Zeljko


