Full_Name: Howard Chu
OS: Solaris 10
Submission from: (NULL) (126.96.36.199)
Submitted by: hyc
test050 hung on me after some number of iterations. Unfortunately I didn't save
the stack traces, but basically there was one thread waiting in send_ldap_ber()
on the write2 cv, and another thread in config_back_add() waiting for a pool
pause to succeed. netstat showed that no connections had queued data, so there
should have been no reason for the writer to still be waiting.
I believe what happened here is that while the writer was waiting (it was a
syncprov qtask replaying events for a psearch) the psearch connection got
closed. Solaris is using select, and select() doesn't specially distinguish
socket close events - they're reported as read events. The deadlock is because
we queue read events into the thread pool, and we don't discover they're
actually closed sockets until the read thread gets to run and tries to read from
the socket (and gets zero bytes back). But since the pool is entering a pause,
the reader thread cannot run, so it can't detect the hangup and dispose of the connection.
The ideal fix for this is to process hangup events inline in the listener thread
instead of pushing them into the thread pool. But that requires being able to
cheaply determine that a hangup actually occurred, and select() doesn't give us that information.
We could get this info using poll() instead. Since nowadays any POSIX platform
that implements select() also implements poll(), we can probably just switch to
poll() and drop select(). One exception is Windows: Winsock only supports poll()
on Windows Vista and newer.
(Note, we had a patch that added a connection_hangup() handler for Linux epoll()
at one point, but I dropped it later because it seemed to have strange
interactions with Samba. Should look into resurrecting it again.)
I don't think we can really fix this issue without knowing for certain when
hangup events occur. If we're forced to keep using select, that implies that the
main listener thread must attempt a read on the socket before deciding how to
dispatch the connection. Any thoughts?
Full_Name: Rodrigo Luiz Vargas Costa
OS: CentOS release 5.2 (Final)
Submission from: (NULL) (188.8.131.52)
I have been exchanging some information on the openldap lists, where it looks
like some improvements are being made to replication for release 2.4.18.
The architecture I'm running has 2 machines in MirrorMode on the same subnet
(on the same switch). These systems are part of an HA system sharing a VIP,
where both machines run slapd simultaneously (bound to any local interface) and
only the VIP is moved for HA purposes.
The issue I'm facing, from a general user's view, shows up when I stop the
secondary Provider2 (master 2) for backup purposes using slapcat.
Provider1 (master 1) continues to provide LDAP service, so some entries can
be created while the backup is running (no consumer from Provider 2).
Even when only a small number of entries differ, once the consumer on Provider 2
reconnects to Provider 1, syncrepl enters the full DB search as expected.
For context, I have some memory limitations, so I need to limit
dncachesize to around 80% of the DB's entry count.
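For reference, such a cap is set with the back-bdb/hdb dncachesize directive in slapd.conf; a sketch with hypothetical numbers (a DB of roughly 4.8M entries capped at ~80%):

```
# hypothetical figures, not this reporter's actual config
database        bdb
suffix          "dc=example,dc=com"
directory       /var/lib/ldap
# DB holds ~4,800,000 entries; cap the DN cache at ~80% of that
dncachesize     3900000
```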
From a user perspective I see that after the cache is filled, the system enters
a state where synchronization no longer happens. For full reference (config,
gdb, etc.), please see the file attached via FTP.
Then I see 2 issues:
1) The consumer from Provider2 never finishes syncrepl and nothing is
replicated, even after days with only a small number of test differences and no
other traffic (Provider 1 stays continuously at 100% CPU);
2) Even if I stop Provider2 (and thus its consumer), I see no change in
Provider 1's activity. The CPU stays at 100% even after days, which suggests
a hang in a thread or in the logic.
I compiled OpenLDAP with GDB symbols and then took several thread traces during
state 2 reported above. It looks like it loops forever, stuck on some thread
lock.
I could also note that in this situation the monitor cache changes a single
entry, at a very slow pace. Being more specific:
dn: cn=Database 1,cn=Databases,cn=Monitor
entryDN: cn=Database 1,cn=Databases,cn=Monitor
keeps alternating between the values 3896287 and 3896288. It looks like memory
is being re-used too quickly, causing locks that take a long time and leave the
system in a non-functional state.
I made several GDB traces for different conditions. Please see ftp attachment
file for details.
PS-> I could not put the file on the openldap ftp; it says the device is full.
Please let me know how I can send this file.
>> masarati(a)aero.polimi.it wrote:
>> I'd appreciate it very much if it would behave in exactly the same way as
>> all the other string-valued options.
> On a somewhat related issue, I note that LDAP_OPT_X_SASL_MECHLIST returns
> a pointer to an array of chars that apparently cannot be mucked with.
> Assuming my understanding is correct, I wonder if this behavior is
> desirable or not, given the fact that if another mech is added, e.g. by
> adding a dynamic module, I expect this list to change.
These are SASL mechs provided by the plugin modules, right?
From an operational standpoint: if a SASL plugin module for a mech was added, I
think it's acceptable that software which queries this option has to be
restarted before this SASL mech becomes known to it. One probably has to add
additional configuration for this SASL mech anyway.
Now the question is what happens if a SASL plugin module is removed and the
software tries to use the removed SASL mech. Clearly, removing plugin modules
on a running system is asking for trouble anyway...
Having said this, I would not care too much about this list changing...
What does this look like with top and/or in dmesg over the run time? Is it a
simple out-of-memory? Definitely a bit of a gross method, but what does
ls -lh core show for size?
(If so, is it warranted given your load, or is there a leak, etc. etc. ... and
of course make sure you're up to date on the surrounding packages;
OpenLDAP isn't the only thing that can leak.)
> clem.oudot(a)gmail.com wrote:
>> I have a rootdn. An extract of my slapd.conf is :
>> database bdb
>> suffix dc=3Dexample,dc=3Dcom
>> rootdn cn=3Dmanager,dc=3Dexample,dc=3Dcom
>> rootpw secret
>> directory /var/lib/ldap
>> overlay ppolicy
>> overlay unique
>> unique_uri ldap:///ou=3Dusers,dc=3Dexample,dc=3Dcom?uid?sub?(objectClass=3D=
> Could you please repost this without the broken quoted printables? Especially
> the filter part of the value for 'unique_uri'.
I suspect the ITS software messed this up, because the direct Cc:-ed messages
to me did not contain the messed-up quoted printables. So here's what Clément
originally sent as an excerpt of his config:
> I have a rootdn. An extract of my slapd.conf is :
> database bdb
> suffix dc=3Dexample,dc=3Dcom
> rootdn cn=3Dmanager,dc=3Dexample,dc=3Dcom
> rootpw secret
> directory /var/lib/ldap
> overlay ppolicy
> overlay unique
> unique_uri ldap:///ou=3Dusers,dc=3Dexample,dc=3Dcom?uid?sub?(objectClass=3D=
Could you please repost this without the broken quoted printables? Especially
the filter part of the value for 'unique_uri'.
Full_Name: Aaron Richton
OS: Solaris 9
Submission from: (NULL) (184.108.40.206)
t@5 (l@5) terminated by signal SEGV (no mapping at the fault address)
Current function is connection_abandon
729 op.orn_msgid = o->o_msgid;
current thread: t@5
=> connection_abandon(c = 0x106df088), line 729 in "connection.c"
 connection_closing(c = 0x106df088, why = 0x2859e0 "connection lost"), line
777 in "connection.c"
 connection_read(s = 11, cri = 0xfd3ffd64), line 1427 in "connection.c"
 connection_read_thread(ctx = 0xfd3ffe0c, argv = 0xb), line 1245 in
 ldap_int_thread_pool_wrapper(xpool = 0x10491f08), line 685 in "tpool.c"
(dbx) print o->o_hdr
o->o_hdr = 0xdeadbeef
Full backtrace in ITS link. testrun directory: