slap_sl_malloc of X bytes failed, using ch_malloc


kevin montuori

1 Apr 2008

11:11 a.m.

hi all -- i have a problem with a 2-multi-master, 1-replica setup. my master servers' directories sync up and stay replicated without too many issues; however, when i start up the replica i get this message on the master that i'm sync'ing the replica from:

slap_sl_malloc of 138718824 bytes failed, using ch_malloc

and, of course, the slapd dies. this is 100% repeatable.

i noticed that this has been an issue in the past (it cropped up on the mailing list around december 2007) and was curious if it's a known issue or a misconfiguration or what.

i'm running 2.4.8 on linux 2.6.18.8-32bit-5-xenU.

thanks for any insight.

-- kevin montuori montuori@gmail.com


Aaron Richton

1 Apr 2008

12:21 p.m.

On Tue, 1 Apr 2008, kevin montuori wrote:

> slap_sl_malloc of 138718824 bytes failed, using ch_malloc

Well, always start with the obvious: are you actually out of memory?

kevin montuori

1:34 p.m.

"AR" == Aaron Richton <richton@nbcs.rutgers.edu> writes:

> slap_sl_malloc of 138718824 bytes failed, using ch_malloc

AR> Well, always start with the obvious: are you actually out of memory?

heh. that wouldn't have been funny. i'm not out of memory though; i'm showing there's ~1.6G (real memory) free at the time of the malloc call and the swap has never been touched.

thanks for the idea though.

-- kevin montuori montuori@gmail.com

Quanah Gibson-Mount

2:52 p.m.

--On April 1, 2008 4:34:32 PM -0400 kevin montuori montuori@gmail.com wrote:

> "AR" == Aaron Richton <richton@nbcs.rutgers.edu> writes:
>
> > slap_sl_malloc of 138718824 bytes failed, using ch_malloc
>
> AR> Well, always start with the obvious: are you actually out of memory?
>
> heh. that wouldn't have been funny. i'm not out of memory though; i'm showing there's ~1.6G (real memory) free at the time of the malloc call and the swap has never been touched.
>
> thanks for the idea though.

Are you running as root, or as a user though? If a user, does the user have memory limits?

--Quanah

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration

kevin montuori

2 Apr 2008

5:41 a.m.

"QG" == Quanah Gibson-Mount <quanah@zimbra.com> writes:

QG> Are you running as root, or as a user though? If a user, does the user have memory limits?

the user has no hard or soft memory limits, except for a soft 8MB stack limit.

thanks, k.

-- kevin montuori montuori@gmail.com
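
The per-user limits discussed above can be checked from inside a process started as the slapd user. This is a minimal sketch using Python's standard `resource` module, not anything from OpenLDAP itself; the choice of which limits to print (address space, data segment, stack) is an assumption about what could matter for a large allocation.

```python
import resource

# Limits that could plausibly affect a large allocation. The selection is an
# assumption for illustration; the thread only mentions "memory limits" and
# a soft 8MB stack limit.
LIMITS = {
    "address space (RLIMIT_AS)": resource.RLIMIT_AS,
    "data segment (RLIMIT_DATA)": resource.RLIMIT_DATA,
    "stack (RLIMIT_STACK)": resource.RLIMIT_STACK,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def report_limits():
    """Return {label: (soft, hard)} for the current process."""
    return {label: resource.getrlimit(res) for label, res in LIMITS.items()}


if __name__ == "__main__":
    for label, (soft, hard) in report_limits().items():
        print(f"{label}: soft={fmt(soft)}, hard={fmt(hard)}")
```

The numbers are only meaningful when the script runs under the same account and the same startup environment as slapd, since limits are inherited per process.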

Pierangelo Masarati

8:10 a.m.

kevin montuori wrote:

> "QG" == Quanah Gibson-Mount <quanah@zimbra.com> writes:
>
> QG> Are you running as root, or as a user though? If a user, does the user have memory limits?
>
> the user has no hard or soft memory limits, except for a soft 8MB stack limit.

Is the issue repeatable? If it is, can you ask slapd to generate a core file, and provide a stack backtrace? See http://www.openldap.org/faq/data/cache/59.html for further instructions, and make sure you use an unstripped binary. In case, keep the core and the slapd binary 'round, since we might need to ask you to print some values from the stack.

Ing. Pierangelo Masarati OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati@sys-net.it
---------------------------------------
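
A process can only leave a core file if its core-size limit allows one. The sketch below is an assumption about one way to arrange that, not a reproduction of the FAQ instructions linked above: it raises the soft core limit to the hard limit using Python's standard `resource` module. A launcher would do this before starting the daemon, because child processes inherit the limit.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file limit to the hard limit.

    Returns the (soft, hard) pair in effect afterwards. If the hard limit
    is 0, core files stay disabled and only a privileged user can change that.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    print(f"core limit now: soft={soft}, hard={hard}")
    # The daemon would be launched from here so that it inherits the limit;
    # the exact command line is deliberately left out.
```

In a shell the same effect is usually obtained with `ulimit -c` before starting the server.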

kevin montuori

10:17 a.m.

"PM" == Pierangelo Masarati <ando@sys-net.it> writes:

PM> Is the issue repeatable? If it is, can you ask slapd to generate a core file, and provide a stack backtrace?

it is and absolutely. the results of (gdb) bt full can be found here:

http://homepage.mac.com/ignavusinfo/ldap-backtrace.txt

again, thanks for the help. k.

-- kevin montuori montuori@gmail.com

Pierangelo Masarati

11:25 a.m.

kevin montuori wrote:

> "PM" == Pierangelo Masarati <ando@sys-net.it> writes:
>
> PM> Is the issue repeatable? If it is, can you ask slapd to generate a core file, and provide a stack backtrace?
>
> it is and absolutely. the results of (gdb) bt full can be found here:
>
> http://homepage.mac.com/ignavusinfo/ldap-backtrace.txt
>
> again, thanks for the help.

Mmmmh, this issue definitely looks like ITS#5437 and ITS#5444; it has nothing to do with the error message in subject. Since I cannot tell whether the two issues are related, or you hit another, already pointed out issue, can you make sure you can repeat the malloc failure issue? I don't want to load you with unnecessary effort; if the real issue is the syncprov_done_ctrl()-related issue, and the malloc failure one is simply a symptom, then you probably don't need to do anything else. Please follow the discussion of the above mentioned ITS-es to find out if the erroneous behavior you see is related.

Thanks, p.

Ing. Pierangelo Masarati OpenLDAP Core Team

kevin montuori

11:51 a.m.

"PM" == Pierangelo Masarati <ando@sys-net.it> writes:

PM> Since I cannot tell whether the two issues are related, or you hit another, already pointed out issue, can you make sure you can repeat the malloc failure issue?

i can. that is to say, each time i start up the replica things chug along for a couple seconds and then i receive the malloc failed message followed by a segfault (both the error and segfault are on the master). i'm happy to share debug output from either the master or replica if that'd help.

PM> Please follow the discussion of the above mentioned ITS-es to find out if the erroneous behavior you see is related.

note that it's a little different than ITS#5444 in that i don't have to perform any action explicitly to cause the master to segfault. it's also, unlike what's mentioned in your followup to 5444, the provider that's crashing, not the consumer.

i'll keep an eye on the two tickets. if there's any further debug information i can provide or testing i could perform, please let me know.

-- kevin montuori montuori@gmail.com

Pierangelo Masarati

12:21 p.m.

kevin montuori wrote:

> "PM" == Pierangelo Masarati <ando@sys-net.it> writes:
>
> PM> Since I cannot tell whether the two issues are related, or you hit another, already pointed out issue, can you make sure you can repeat the malloc failure issue?
>
> i can. that is to say, each time i start up the replica things chug along for a couple seconds and then i receive the malloc failed message followed by a segfault (both the error and segfault are on the master). i'm happy to share debug output from either the master or replica if that'd help.
>
> PM> Please follow the discussion of the above mentioned ITS-es to find out if the erroneous behavior you see is related.
>
> note that it's a little different than ITS#5444 in that i don't have to perform any action explicitly to cause the master to segfault. it's also, unlike what's mentioned in your followup to 5444, the provider that's crashing, not the consumer.

OK. What I believe is common with those ITSes is that the crash is the result of calling syncprov_done_ctrl() with a corrupted/non-initialized cookie.

Ing. Pierangelo Masarati OpenLDAP Core Team

Howard Chu

1:52 p.m.

Pierangelo Masarati wrote:

> kevin montuori wrote:
>
>> "PM" == Pierangelo Masarati <ando@sys-net.it> writes:
>>
>> PM> Is the issue repeatable? If it is, can you ask slapd to generate a core file, and provide a stack backtrace?
>>
>> it is and absolutely. the results of (gdb) bt full can be found here:
>>
>> http://homepage.mac.com/ignavusinfo/ldap-backtrace.txt
>>
>> again, thanks for the help.
>
> Mmmmh, this issue definitely looks like ITS#5437 and ITS#5444; it has nothing to do with the error message in subject. Since I cannot tell whether the two issues are related, or you hit another, already pointed out issue, can you make sure you can repeat the malloc failure issue? I don't want to load you with unnecessary effort; if the real issue is the syncprov_done_ctrl()-related issue, and the malloc failure one is simply a symptom, then you probably don't need to do anything else. Please follow the discussion of the above mentioned ITS-es to find out if the erroneous behavior you see is related.

Something doesn't make sense in this trace. In frame 10, "changed = 1" but that is an impossible value for this variable. It must be either 0 or SS_CHANGED (2). Since the code in syncprov.c line 2018 relies on the SS_CHANGED value to be set in order to initialize the cookie, and that isn't being set correctly, the syncprov_done_ctrl invocation breaks.

I suggest you try recompiling without optimization and see if the behavior changes. If it still crashes the same way, please post a new trace.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
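
The point about the `changed` variable can be illustrated with a small sketch. It is not the code from syncprov.c: only the value `SS_CHANGED = 2` and the two legal states (0 or SS_CHANGED) come from the message above. The function names are hypothetical, and the use of a bit test is an assumption (an equality test gives the same answers for the three values shown).

```python
SS_CHANGED = 2                  # value quoted in the message above
VALID_STATES = (0, SS_CHANGED)  # the only values `changed` should ever hold


def is_valid_state(changed):
    """True if `changed` holds one of the two values the code expects."""
    return changed in VALID_STATES


def cookie_would_be_initialized(changed):
    """Hypothetical stand-in for the check that guards cookie setup."""
    return (changed & SS_CHANGED) != 0


if __name__ == "__main__":
    for value in (0, SS_CHANGED, 1):
        print(
            f"changed={value}: valid={is_valid_state(value)}, "
            f"cookie initialized={cookie_would_be_initialized(value)}"
        )
```

With `changed = 1`, as seen in frame 10 of the trace, the value is neither legal state and the guard never fires, so a later consumer of the cookie would see uninitialized data. That matches the symptom described, though the sketch says nothing about how the variable ended up with that value.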

kevin montuori

3 p.m.

"HC" == Howard Chu <hyc@symas.com> writes:

HC> I suggest you try recompiling without optimization and see if the behavior changes. If it still crashes the same way, please post a new trace.

i've recompiled without the -O2 flags and find slapd does indeed exhibit the same behavior. there's a backtrace at:

http://homepage.mac.com/ignavusinfo/ldap-backtrace-no-optimization.txt

-- kevin montuori montuori@gmail.com

Howard Chu

3 Apr 2008

12:03 a.m.

kevin montuori wrote:

> "HC" == Howard Chu <hyc@symas.com> writes:
>
> HC> I suggest you try recompiling without optimization and see if the behavior changes. If it still crashes the same way, please post a new trace.
>
> i've recompiled without the -O2 flags and find slapd does indeed exhibit the same behavior. there's a backtrace at:
>
> http://homepage.mac.com/ignavusinfo/ldap-backtrace-no-optimization.txt

I believe this is now fixed in CVS HEAD. See the patch to syncprov.c, rev 1.227.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


openldap-software@openldap.org
