openldap-bugs October 2007

openldap-bugs@openldap.org

38 participants
215 discussions

Re: (ITS#4860) Sets' enhancement
by jclarke＠linagora.com 09 Oct '07

jclarke(a)linagora.com wrote:
> Pierangelo Masarati wrote:
>> Should be fixed now in HEAD/re24/re23. Please test. p.
>
> I've been testing (at last, sorry for the delay), and I've come across
> another memory problem. Backtrace is below, and valgrind output is attached.

Got this one: it was a double-free in sets.c occurring after a slap_set_join() with lset or rset empty. The non-empty set was returned and then freed, causing a double-free error or segfault.

The patch attached corrects this problem on RE23 and HEAD for me and doesn't have any side effects on our test set. However, it may not be the "right" way - please correct if necessary!

Your recent fixes have solved all the issues from our test cases we were encountering. Thank you very much for them.

Jon

--
Jonathan Clarke
Cellule OSSA - Groupe LINAGORA
27 rue de Berri, 75008 Paris
Tel: 01 58 18 68 28, fax: 01 58 18 68 29
http://www.linagora.com - http://www.08000linux.com

Attachment: jonathan-clarke-071008.patch

--- servers/slapd/sets.c.orig 2007-10-08 18:20:08.000000000 +0200
+++ servers/slapd/sets.c 2007-10-08 18:22:29.000000000 +0200
@@ -261,11 +261,15 @@
     } else {
         set = set_dup( cp, lset, SLAP_SET_LREF2REF( op_flags ) );
+        /* set array reference has been copied - don't free */
+        op_flags |= SLAP_SET_LREFVAL | SLAP_SET_LREFARR;
         break;
     }
 } else if ( j == 0 ) {
     set = set_dup( cp, rset, SLAP_SET_RREF2REF( op_flags ) );
+    /* set array reference has been copied - don't free */
+    op_flags |= SLAP_SET_RREFVAL | SLAP_SET_RREFARR;
     break;
 }
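The ownership problem described in this message can be illustrated with a small stand-alone model. This is only a sketch and is not the slapd code: toy_set, toy_join and the TOY_LREF/TOY_RREF flags are hypothetical names, loosely imitating the idea that a flag marks an operand as a reference the callee must not free. When one operand is empty, the join hands the other operand's array to the result, so it has to flag that operand as a reference before the cleanup step; without that, the same array would be freed once here and once by whoever frees the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flags: a set bit means "operand is only a reference,
 * the join must not free it". */
#define TOY_LREF 0x1u
#define TOY_RREF 0x2u

typedef struct {
    int    *items;  /* heap array, may be NULL when n == 0 */
    size_t  n;      /* number of items */
} toy_set;

/* Concatenating join that consumes its operands unless they are flagged
 * as references. The caller owns (and later frees) the returned array. */
toy_set toy_join(toy_set l, toy_set r, unsigned flags)
{
    toy_set out = { NULL, 0 };

    if (l.n == 0) {
        /* Shortcut: result reuses r's array, so r must not be freed below. */
        out = r;
        flags |= TOY_RREF;
    } else if (r.n == 0) {
        /* Same shortcut for the other side. */
        out = l;
        flags |= TOY_LREF;
    } else {
        out.n = l.n + r.n;
        out.items = malloc(out.n * sizeof *out.items);
        assert(out.items != NULL);
        memcpy(out.items, l.items, l.n * sizeof *l.items);
        memcpy(out.items + l.n, r.items, r.n * sizeof *r.items);
    }

    /* Cleanup: free only the operands this function still owns. */
    if (!(flags & TOY_LREF))
        free(l.items);
    if (!(flags & TOY_RREF))
        free(r.items);

    return out;
}
```

Dropping either of the two "flags |=" lines reproduces the pattern reported above: the shortcut result and the cleanup step would both believe they own the same array.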

1 0

Re: (ITS#4940) libldap doesn't wait for server's TLS close_notify
by hyc＠symas.com 09 Oct '07

Philip Guenther wrote:
> Anyway, this issue isn't worth keeping an ITS open about, as it doesn't
> actually cause failures or visible errors. I might someday chase down a
> clean way of implementing this second option, but only after the much more
> useful work of coming up with a reasonable API to let event-driven apps do
> STARTTLS without blocking. Someday.

Ah yes, Someday. Drop by this page when you get a chance:
http://scratchpad.wikia.com/wiki/LDAP_C_API

Some day it'll be more than just a single page...

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/

1 0

Re: (ITS#4940) libldap doesn't wait for server's TLS close_notify
by guenther+ldapdev＠sendmail.com 09 Oct '07

On Mon, 8 Oct 2007, Howard Chu wrote:
...
>> There are a number of ways this can be handled:
>> 1) change the client to wait until it sees the server's close_notify alert
>>    by replacing "SSL_shutdown( p->ssl );" in tls.c with the two lines:
>>        if (SSL_shutdown( p->ssl ) == 0)
>>            SSL_shutdown( p->ssl );
>>    (I have confirmed that this works. As documented, the first call
>>    will return 1 if the server's close_notify has already been
>>    received, if not, the second call will block until it is received.)
>
> So if the server doesn't send one, the client will be stuck waiting forever?

It would also unblock if the server closed the connection.

One downside of this option that I thought of later is that it shifts the TCP CLOSE_WAIT state from the client to the server. Fixing that would add more complexity to the sockbuf layer than this entire change is worth. Having chatted with Kurt about this at the last IETF meeting and pondered failure modes, I'm no longer in favor of this option.

>> 2) change the client to not bother to send a close_notify alert when
>>    it's just going to close() the connection; change the server to not
>>    send a close_notify if it didn't get one.
<...>
>
> Sounds like a change in the SSL library, not something for us to worry
> about.

Since "send a close_notify alert" == "call SSL_shutdown() for the first time", it would be a change in how the SSL library was used by libldap.

Anyway, this issue isn't worth keeping an ITS open about, as it doesn't actually cause failures or visible errors. I might someday chase down a clean way of implementing this second option, but only after the much more useful work of coming up with a reasonable API to let event-driven apps do STARTTLS without blocking. Someday.

Philip Guenther

1 0

Re: ITS#5174 openldap.schema entry not valid per RFC 4512
by ando＠sys-net.it 09 Oct '07

> The reason I reported it was that I wrote a parser for LDAP schema
> based directly on the formal grammar, downloaded all the schema from
> OpenLDAP as a test, and that was the only schema that broke, and only
> on the one element. I did not have any trouble with the OpenLDAP
> software accepting the schema. If you use a grammar builder (I used
> ANTLR) rather than a hand-coded parser, being permissive makes it a
> bit more complex. Allowing the elements to appear in any order would
> be easy to parse for the RFC 4512 elements, because they happen to be
> defined with markers ("NAME", "DESC" and so on) that make them easy
> to differentiate, but then enforcing at-most-once semantics on the
> elements requires either an additional pass or some hand coded
> predicates to check for duplicates in a data structure being built
> on-the-fly.

Thanks for the feedback. It's now fixed in HEAD/re24.

p.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Email:   pierangelo.masarati(a)sys-net.it
---------------------------------------

1 0

Re: ITS#5174 openldap.schema entry not valid per RFC 4512
by bhanafee＠gmail.com 09 Oct '07

p.,

Thanks for looking at it.

I think the RFC 4234 parse of the elements in the RFC 4512 definition would be as a 'concatenation' (based on rule->elements->alternation->concatenation). To get "any order" behavior under RFC 4234, I think the elements in the definition would be separated by '/' characters, and all the elements would be enclosed in a group with a repetition element.

There's an example in the "Generalized Time" definition (RFC 4517, section 3.3.13) that shows why the optional members must be in sequence by default. In that particular case, the [fraction] can't be allowed to appear before the [minute ...] part, or it would be ambiguous whether the final "3015" in "2007100807.53015" was part of the fraction or a minute and second that appears (for no good reason) after the fraction ".5".

The reason I reported it was that I wrote a parser for LDAP schema based directly on the formal grammar and downloaded all the schema from OpenLDAP as a test. That was the only schema that broke, and only on the one element. I did not have any trouble with the OpenLDAP software accepting the schema.

If you use a grammar builder (I used ANTLR) rather than a hand-coded parser, being permissive makes it a bit more complex. Allowing the elements to appear in any order would be easy to parse for the RFC 4512 elements, because they happen to be defined with markers ("NAME", "DESC" and so on) that make them easy to differentiate, but then enforcing at-most-once semantics on the elements requires either an additional pass or some hand coded predicates to check for duplicates in a data structure being built on-the-fly.

-- Brian

On 10/8/07, Pierangelo Masarati <ando(a)sys-net.it> wrote:
> Not sure if ordering of optional sequence members is required by RFC
> 4234, but the change you suggest sounds harmless. OpenLDAP software, in
> this sense, is usually permissive in what is accepted and strict in what
> is emitted.
>
> Thanks, p.
>
>
>
> Ing. Pierangelo Masarati
> OpenLDAP Core Team
>
> SysNet s.r.l.
> via Dossi, 8 - 27100 Pavia - ITALIA
> http://www.sys-net.it
> ---------------------------------------
> Office:  +39 02 23998309
> Mobile:  +39 333 4963172
> Email:   pierangelo.masarati(a)sys-net.it
> ---------------------------------------
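The "hand coded predicate" mentioned above for at-most-once checking can be sketched as a bitmask that is updated while the markers are read in whatever order they arrive. This is an illustrative sketch only: check_at_most_once and its return codes are hypothetical, the marker table is a small subset of the RFC 4512 keywords, and a real parser would also have to consume each marker's value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of the keyword markers used in RFC 4512
 * attribute type descriptions. */
static const char *const toy_markers[] = {
    "NAME", "DESC", "OBSOLETE", "SUP", "EQUALITY", "SYNTAX"
};
#define TOY_NMARKERS (sizeof toy_markers / sizeof toy_markers[0])

/* Accept markers in any order, but each at most once.
 * Returns 0 on success, -1 for an unknown marker, -2 for a duplicate. */
int check_at_most_once(const char *const *tokens, size_t ntokens)
{
    unsigned seen = 0;  /* bit m is set once toy_markers[m] has appeared */

    assert(tokens != NULL || ntokens == 0);

    for (size_t i = 0; i < ntokens; i++) {
        size_t m;

        /* Find which marker this token is. */
        for (m = 0; m < TOY_NMARKERS; m++) {
            if (strcmp(tokens[i], toy_markers[m]) == 0)
                break;
        }
        if (m == TOY_NMARKERS)
            return -1;
        if (seen & (1u << m))
            return -2;  /* same marker seen twice */
        seen |= 1u << m;
    }
    return 0;
}
```

With the strictly ordered grammar no such bookkeeping is needed, because a repeated or misplaced element simply fails to match the concatenation.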

1 0

Re: (ITS#4940) libldap doesn't wait for server's TLS close_notify
by hyc＠symas.com 09 Oct '07

guenther+ldapdev(a)sendmail.com wrote:
> Full_Name: Philip Guenther
> Version: 2.3.27
> OS: Linux and Solaris
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (64.58.1.252)
>
>
> [I vaguely recall seeing a report of this issue in the archives of one of the
> mailing lists, but I can no longer find the original.]
>
> If you trace the packets sent when you use, for example, ldapsearch against a
> server on a different host, using either the -Z option to do TLS or using an
> ldaps URI, you'll discover that the TCP connection is actually reset instead of
> being closed cleanly: the client sends TCP RSTs in response to the server's
> final packets.
>
> This is because libldap uses the following sequence when unbinding a TLS or SSL
> connection:
> 1) send the unbind request (over the TLS or SSL layer)
> 2) call SSL_shutdown(), sending the TLS close_notify alert
> 3) call close()
>
> After receiving the close_notify alert from step (2), the server sends back its
> own close_notify alert and then calls close(). However, because the client
> didn't wait for the server's response before calling close() on its end, the
> client's TCP stack considers the TCP connection to already be gone and responds
> with the RST packets. This occurs with Linux and Solaris clients and probably
> most other unices: the response to packets after a close() doesn't vary in my
> experience.
>
> There are a number of ways this can be handled:
> 1) change the client to wait until it sees the server's close_notify alert by
>    replacing "SSL_shutdown( p->ssl );" in tls.c with the two lines:
>        if (SSL_shutdown( p->ssl ) == 0)
>            SSL_shutdown( p->ssl );
>    (I have confirmed that this works. As documented, the first call will
>    return 1 if the server's close_notify has already been received, if not,
>    the second call will block until it is received.)

So if the server doesn't send one, the client will be stuck waiting forever?

> 2) change the client to not bother to send a close_notify alert when it's just
>    going to close() the connection; change the server to not send a
>    close_notify if it didn't get one. This probably violates the TLS spec,
>    but the fact that TLS/1.1 permits resumption of sessions without
>    close_notify having been sent indicates that the violation is not a major
>    issue, particularly given that LDAP's unbind request prevents truncation
>    attacks. Close_notifies are, of course, required if the client just wants
>    to terminate the TLS layer and resume unprotected LDAP operations.

Sounds like a change in the SSL library, not something for us to worry about.

>
> 3) ignore the issue: it only causes one or two extra packets to be sent. While
>    it also eliminates the TIME_WAIT state, LDAP's application-level close (the
>    unbind request) means it doesn't need reliable full-duplex closure, so the
>    only concern would be random connection issues from reincarnations of the
>    TCP tuple, which is unlikely for an LDAP connection.
>
> Personally, I like the simplicity and cleanliness of solution (1).

(1) has the possibility of an indefinite hang. As such, I think it best to leave it with the current behavior.

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/
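The two-call pattern from option (1) can be written down without linking against an SSL library by passing the shutdown routine in as a function pointer. This is a sketch under stated assumptions, not libldap code: shutdown_and_wait, struct fake_conn and fake_shutdown are hypothetical names, and the stand-in follows the documented SSL_shutdown() convention quoted in the report (0 means our close_notify was sent but the peer's has not been seen yet, 1 means both have been exchanged, a negative value is an error).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for SSL_shutdown(): 0 = our close_notify sent, peer's not yet
 * seen; 1 = both directions shut down; negative = error. */
typedef int (*shutdown_fn)(void *conn);

/* Option (1): call once to send close_notify and, if the peer's alert has
 * not arrived yet, call again to wait for it. On a blocking socket that
 * second call is where an indefinite wait could occur. */
int shutdown_and_wait(shutdown_fn do_shutdown, void *conn)
{
    assert(do_shutdown != NULL);

    int rc = do_shutdown(conn);
    if (rc == 0)
        rc = do_shutdown(conn);
    return rc;
}

/* Fake connection used only to exercise the helper: the peer's
 * close_notify "arrives" on the call numbered done_on_call. */
struct fake_conn {
    int calls;         /* how many times shutdown was invoked */
    int done_on_call;  /* call number that completes the shutdown */
};

int fake_shutdown(void *p)
{
    struct fake_conn *c = p;

    c->calls++;
    return c->calls >= c->done_on_call ? 1 : 0;
}
```

With a real connection the second call returns only when the peer's close_notify arrives or the connection is closed, which is the hang concern raised in the reply above.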

1 0

Re: (ITS#5171) hdb txn_checkpoint failures
by richton＠nbcs.rutgers.edu 08 Oct '07

> One more thing to check is just using "ls -l" to see if the actual size of
> the log files corresponds with the db_stat offsets. E.g. if slave6 base1's
> log.0000001 is really 8MB but the LSN is only 233KB, then we have to look for
> a weird in-memory corruption. If not, then somebody reset your logs.

No, it looks like those sizes all match. Actually, the "reset logs" may well be the case (although I still can't imagine how, I'm willing to just chalk this whole thing up to user error...of course logs show that the user was me, which is a shame :) and is hard to disprove (with only one log file active) with the exception of base2. base2 has multiple log files going back:

[slave4]
-rw------- 1 root root 9999986 Sep  6 18:03 log.0000000001
-rw------- 1 root root 9999967 Sep 10 14:03 log.0000000002
-rw------- 1 root root 9999983 Sep 18 16:33 log.0000000003
-rw------- 1 root root 9429761 Oct  8 05:33 log.0000000004

[slave6]
-rw------- 1 root root 9999986 Sep  6 18:03 log.0000000001
-rw------- 1 root root 9999967 Sep 10 14:03 log.0000000002
-rw------- 1 root root 9999983 Sep 18 16:33 log.0000000003
-rw------- 1 root root 9429761 Oct  8 05:33 log.0000000004

which of course match the db_stat -l, but also extend back prior to September 24 according to the filesystem timestamps. I guess the argument could be made that log 4 was truncated on September 24...would that be detected/come up sane/come up bad in the db_stat?

1 0

(ITS#5177) refreshAndPersist race condition
by hyc＠OpenLDAP.org 08 Oct '07

Full_Name: Howard Chu
Version: 2.3/2.4
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (76.168.84.21)
Submitted by: hyc

If heavy Add traffic is occurring while a refreshAndPersist consumer is refreshing, newly added entries may be omitted from the refresh phase without being queued into the Persist phase.

reported by stelios.xx.grigoriadis(a)ericsson.com
http://www.openldap.org/lists/openldap-bugs/200709/msg00172.html

1 0

Sync replication failure during startup.
by Stelios Grigoriadis 08 Oct '07

OpenLDAP v. 2.3.32
Berkeley DB 4.6
gcc 4.1.0

Replication doesn't work if the master server is started after the replica servers and a large number of simultaneous updates are performed while the server is starting up. The entries that didn't get replicated to the replicas will not be replicated even after a restart of both master and replicas. The contextCSN is set to a value larger than the entryCSN of the "lost" entries.

This is what I think happens during a master server startup with simultaneous updates ongoing (and replicas trying to sync in the initial phase). Suppose that two clients (Client1 and Client2) are adding the entries a and b respectively. If that happens between t1 and t2 (one second apart) they will get the same entryCSN (same timestamp). If entry a is committed at tc1 and b at tc2, any replica search in between will only get the entry a. The entry b will be lost.

Client1 entry=a, csn=x
Client2 entry=b, csn=x

Timeline
------+----------+---------+----+------>
      |                         |
      t1         |         |    t2=t1+1
                 |         |
          tc1=entry a   tc2=entry b
           committed     committed

Replica search query between tc1 and tc2.

I don't know if a higher granularity would prevent this, or even better, to have some kind of a counter so that every modification gets a unique csn.

Can you please comment on our analysis to let us know if the analysis is correct or if we have missed something important? Any help or hints on how to avoid or fix this problem is greatly appreciated. If I receive useful information directly in private email, I will post a summary.

Regards
Stelios Grigoriadis
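The "counter" idea raised in this message can be sketched as a generator that appends a per-second sequence number to the timestamp, so that two changes made within the same second still get distinct, ordered stamps. This only illustrates the suggestion as worded above: toy_csn_gen, toy_csn_next and the "seconds#counter" layout are hypothetical and are not slapd's CSN format, and nothing here establishes whether unique stamps alone would address the reported behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical generator state: last second handed out plus a counter
 * of stamps already issued within that second. */
typedef struct {
    long     last_sec;
    unsigned count;
} toy_csn_gen;

/* Write a stamp of the form "<10-digit seconds>#<6-digit hex counter>".
 * Stamps from one generator are unique and sort in issue order under
 * strcmp(), as long as the counter stays within six hex digits. */
void toy_csn_next(toy_csn_gen *g, long now_sec, char *buf, size_t buflen)
{
    assert(g != NULL && buf != NULL);

    if (now_sec <= g->last_sec) {
        /* Same second (or clock stepped back): keep the second, bump the counter. */
        g->count++;
    } else {
        g->last_sec = now_sec;
        g->count = 0;
    }
    snprintf(buf, buflen, "%010ld#%06x", g->last_sec, g->count);
}

/* Ordering of two stamps is plain string comparison. */
int toy_csn_cmp(const char *a, const char *b)
{
    return strcmp(a, b);
}
```

In the scenario above, entries a and b would then carry different stamps even though both were added between t1 and t2.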

4 23

Re: (ITS#5171) hdb txn_checkpoint failures
by hyc＠symas.com 08 Oct '07

Aaron Richton wrote:
>> It's still rather suspicious that slave4 and slave6 both had identical log
>> status for base1 (1/188113) but different requested locations (1/8730339 vs
>> 1/8730401). If they're identically configured slaves then they ought to be in
>> lock-step. Then again, obviously they're not identical since slave6 doesn't
>> show base4 in your log.
>
> Identical is relative. They've got the same OpenLDAP and supporting
> binaries running on the same patches of Solaris 9 running identical
> turn-up scripts with identical configuration files. But this is
> production, so we've got data changes over time. For instance, the slaves
> bootstrap with a slapadd -q, and the underlying slapcat could easily be
> different from slave4 vs. slave6 (the most recent one is automatically
> used). I'd imagine this would look different at the db layer, even once
> syncrepl eventually converged the logical data?
>
>> Do you have the db_stat output from an uncorrupted slave? What about the
>> master?
>
> Sure... https://www.nbcs.rutgers.edu/~richton/its5171_dbstatl2

Judging from the LSNs in use on these other servers, it sure looks like somebody went in and zeroed out your logs on slave4 and slave6. I don't think the environment spontaneously corrupted itself and reset the log offsets...

One more thing to check is just using "ls -l" to see if the actual size of the log files corresponds with the db_stat offsets. E.g. if slave6 base1's log.0000001 is really 8MB but the LSN is only 233KB, then we have to look for a weird in-memory corruption. If not, then somebody reset your logs.

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/

1 0
