guenther+ldapdev(a)sendmail.com wrote:
> Full_Name: Philip Guenther
> Version: 2.3.27
> OS: linux and solaris
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (64.58.1.252)
>
>
> The description of the TLS_REQCERT setting in the ldap.conf(5) manpage does not
> match the actual operation of the code. In particular:
> - clients don't 'request' server certs in TLS. They get one if the cipher
>   suite uses them, otherwise they don't
> - 'allow' checks the identity of the server vs its cert (per RFC 4513,
> section 3.1.3) and will terminate the connection if they don't match
> - 'try' is the same as 'demand' and 'hard'
Not quite. With both "allow" and "try" it's OK if the server provides no
certificate. The difference is, with "try", if a cert is provided, it
must be valid.
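
To make that distinction concrete, here is a minimal sketch in C of the four
behaviours as the existing manpage text and the reply above describe them. It
is a toy model, not libldap code: the enum names and the function
reqcert_session_proceeds() are hypothetical, and the sketch deliberately
leaves out the server-identity check that the report says 'allow' performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; "hard" is treated as an alias for "demand". */
enum reqcert_level { REQCERT_NEVER, REQCERT_ALLOW, REQCERT_TRY, REQCERT_DEMAND };

/* What the server presented during the handshake. */
enum cert_status { CERT_NONE, CERT_BAD, CERT_VALID };

/*
 * Returns true if the session proceeds and false if it is terminated,
 * following the manpage wording plus the clarification that "try"
 * tolerates a missing certificate but not a bad one.
 */
bool reqcert_session_proceeds(enum reqcert_level level, enum cert_status cert)
{
    switch (level) {
    case REQCERT_NEVER:
        return true;               /* certificate is not checked at all */
    case REQCERT_ALLOW:
        return true;               /* missing or bad certificate is tolerated */
    case REQCERT_TRY:
        return cert != CERT_BAD;   /* missing is fine, bad is fatal */
    case REQCERT_DEMAND:
        return cert == CERT_VALID; /* missing or bad is fatal */
    }
    assert(!"unknown TLS_REQCERT level");
    return false;
}
```

Under this model, "try" and "demand" differ only in the CERT_NONE case, which
is the point made above.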
>
>
> Here's a possible patch to ldap.conf.5 to fix the above. A reference to the RFC
> should perhaps be added to the text. I was also tempted to add a sentence to
> the lead-in to clarify that the setting has no effect if the negotiated cipher
> suite doesn't use certs, as a clarification of the "if any" in the existing
> lead-in, but that's minor. Simply having an even slightly correct description
> of 'allow' is the important thing.
>
> --- ldap.conf.5 26 Jan 2006 05:57:49 -0000
> +++ ldap.conf.5 30 Apr 2007 08:39:53 -0000
> @@ -249,22 +249,20 @@
> .RS
> .TP
> .B never
> -The client will not request or check any server certificate.
> +The client will not check the server certificate at all.
> .TP
> .B allow
> -The server certificate is requested. If no certificate is provided,
> -the session proceeds normally. If a bad certificate is provided, it will
> -be ignored and the session proceeds normally.
> -.TP
> -.B try
> -The server certificate is requested. If no certificate is provided,
> -the session proceeds normally. If a bad certificate is provided,
> +The client will only verify that name used to connect to the server
> +matches one of the server certificate's subjectAltName or CN values.
> +If no match is found, the session is immediately terminated.
> +.TP
> +.B try | demand | hard
> +These keywords are equivalent.
> +The client will verify the server certificate is valid and matches the
> +name used to connect (as for 'allow').
> +If a bad or mismatched certificate is provided,
> the session is immediately terminated.
> -.TP
> -.B demand | hard
> -These keywords are equivalent. The server certificate is requested. If no
> -certificate is provided, or a bad certificate is provided, the session
> -is immediately terminated. This is the default setting.
> +This is the default setting.
> .RE
> .TP
> .B TLS_CRLCHECK <level>
>
>
>
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Full_Name: Philip Guenther
Version: 2.3.27
OS: linux and solaris
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (64.58.1.252)
The description of the TLS_REQCERT setting in the ldap.conf(5) manpage does not
match the actual operation of the code. In particular:
- clients don't 'request' server certs in TLS. They get one if the cipher
  suite uses them, otherwise they don't
- 'allow' checks the identity of the server vs its cert (per RFC 4513,
section 3.1.3) and will terminate the connection if they don't match
- 'try' is the same as 'demand' and 'hard'
Here's a possible patch to ldap.conf.5 to fix the above. A reference to the RFC
should perhaps be added to the text. I was also tempted to add a sentence to
the lead-in to clarify that the setting has no effect if the negotiated cipher
suite doesn't use certs, as a clarification of the "if any" in the existing
lead-in, but that's minor. Simply having an even slightly correct description
of 'allow' is the important thing.
--- ldap.conf.5 26 Jan 2006 05:57:49 -0000
+++ ldap.conf.5 30 Apr 2007 08:39:53 -0000
@@ -249,22 +249,20 @@
.RS
.TP
.B never
-The client will not request or check any server certificate.
+The client will not check the server certificate at all.
.TP
.B allow
-The server certificate is requested. If no certificate is provided,
-the session proceeds normally. If a bad certificate is provided, it will
-be ignored and the session proceeds normally.
-.TP
-.B try
-The server certificate is requested. If no certificate is provided,
-the session proceeds normally. If a bad certificate is provided,
+The client will only verify that name used to connect to the server
+matches one of the server certificate's subjectAltName or CN values.
+If no match is found, the session is immediately terminated.
+.TP
+.B try | demand | hard
+These keywords are equivalent.
+The client will verify the server certificate is valid and matches the
+name used to connect (as for 'allow').
+If a bad or mismatched certificate is provided,
the session is immediately terminated.
-.TP
-.B demand | hard
-These keywords are equivalent. The server certificate is requested. If no
-certificate is provided, or a bad certificate is provided, the session
-is immediately terminated. This is the default setting.
+This is the default setting.
.RE
.TP
.B TLS_CRLCHECK <level>
Full_Name: Philip Guenther
Version: 2.3.27
OS: Linux and Solaris
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (64.58.1.252)
[I vaguely recall seeing a report of this issue in the archives of one of the
mailing lists, but I can no longer find the original.]
If you trace the packets sent when you use, for example, ldapsearch against a
server on a different host, using either the -Z option to do TLS or using an
ldaps URI, you'll discover that the TCP connection is actually reset instead of
being closed cleanly: the client sends TCP RSTs in response to the server's
final packets.
This is because libldap uses the following sequence when unbinding a TLS or SSL
connection:
1) send the unbind request (over the TLS or SSL layer)
2) call SSL_shutdown(), sending the TLS close_notify alert
3) call close()
After receiving the close_notify alert from step (2), the server sends back its
own close_notify alert and then calls close(). However, because the client
didn't wait for the server's response before calling close() on its end, the
client's TCP stack considers the TCP connection to already be gone and responds
with the RST packets. This occurs with Linux and Solaris clients and probably
most other unices: the response to packets after a close() doesn't vary in my
experience.
There are a number of ways this can be handled:
1) change the client to wait until it sees the server's close_notify alert by
   replacing "SSL_shutdown( p->ssl );" in tls.c with the two lines:
       if (SSL_shutdown( p->ssl ) == 0)
           SSL_shutdown( p->ssl );
   (I have confirmed that this works. As documented, the first call will return
   1 if the server's close_notify has already been received; if not, the second
   call will block until it is received.)
2) change the client to not bother to send a close_notify alert when it's just
   going to close() the connection; change the server to not send a
   close_notify if it didn't get one. This probably violates the TLS spec, but
   the fact that TLS/1.1 permits resumption of sessions without close_notify
   having been sent indicates that the violation is not a major issue,
   particularly given that LDAP's unbind request prevents truncation attacks.
   Close_notifies are, of course, required if the client just wants to
   terminate the TLS layer and resume unprotected LDAP operations.
3) ignore the issue: it only causes one or two extra packets to be sent. While
   it also eliminates the TIME_WAIT state, LDAP's application-level close (the
   unbind request) means it doesn't need reliable full-duplex closure, so the
   only concern would be random connection issues from reincarnations of the
   TCP tuple, which is unlikely for an LDAP connection.
Personally, I like the simplicity and cleanliness of solution (1).
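
The control flow of solution (1) can be sketched in isolation as follows. This
is only an illustration under stated assumptions: shutdown_fn is a
hypothetical stand-in for SSL_shutdown() that follows its documented return
convention (1 when both close_notify alerts have been exchanged, 0 when ours
was sent but the peer's has not been seen, negative on error), and
tls_shutdown_wait() and the fake_conn helper are invented names, not libldap
or OpenSSL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SSL_shutdown(); see the lead-in for return values. */
typedef int (*shutdown_fn)(void *conn);

/*
 * Two-step shutdown: send our close_notify, and if the peer's has not
 * arrived yet, call again so that the second call waits for it.
 * Returns the status of the last call made.
 */
int tls_shutdown_wait(shutdown_fn do_shutdown, void *conn)
{
    int rc;

    assert(do_shutdown != NULL);
    rc = do_shutdown(conn);
    if (rc == 0)
        rc = do_shutdown(conn);
    return rc;
}

/* Toy connection used only to exercise the control flow. */
struct fake_conn {
    int calls;               /* number of shutdown calls seen so far */
    int peer_already_closed; /* nonzero if the peer's close_notify arrived first */
};

/* Reports completion on the first call only if the peer closed first. */
int fake_shutdown(void *p)
{
    struct fake_conn *c = p;

    c->calls++;
    if (c->peer_already_closed || c->calls >= 2)
        return 1;
    return 0;
}
```

With a real connection, the callback would be a thin wrapper that casts the
pointer to SSL * and calls SSL_shutdown(); close() would follow only after
tls_shutdown_wait() returns.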
Full_Name: Ceri Davies
Version: HEAD
OS: FreeBSD
URL: http://shrike.submonkey.net/~ceri/clients-tools-common.c.patch
Submission from: (NULL) (81.106.128.65)
The help output from the client tools contains a misspelling of "Identifier".
See the patch at http://shrike.submonkey.net/~ceri/clients-tools-common.c.patch
which applies against today's -HEAD.
I seriously doubt that you need this, but:
I, Ceri Davies, hereby place the following modifications to OpenLDAP Software
(and only these modifications) into the public domain. Hence, these
modifications may be freely used and/or redistributed for any purpose with or
without attribution and/or other notice.
----- quanah(a)zimbra.com wrote:
> ----- richton(a)nbcs.rutgers.edu wrote:
> > I don't have #5 (sleepycat#14657) nor the unofficial
> > http://www.stanford.edu/services/directory/openldap/configuration/patches/d…
> > patch. As for the official one, I'm not sure about its relevance to
> > the actual SEGV due to the "recovery...fail" comment. In other words,
> > though it may be impacting the ability of alock/db_recover to do its
> > thing, that's just a side effect of the unclean shutdown which is the
> > real bug here to my view.
>
> Patch #5 specifically deals with a race condition where a checkpoint
> is occurring while a cache buffer retrieval is also occurring, causing
> database corruption that cannot be recovered from later. At
> least, that's how I read Sleepycat's description:
>
> 5. Fix a bug where cache buffer retrieval could race with a checkpoint
> call, potentially causing database environment recovery to fail.
> [#14657]
>
> Given that OpenLDAP checkpoints on shutdown, shutting down the server
> could be what is triggering the issue for you. I'd suggest applying
> the patch and seeing if this resolves your problem.
Just to note, I shut down one of my 2.3.35 servers that's served out over 1 million connections since I brought it up, and everything was clean on both shutdown and startup.
--Quanah
----- richton(a)nbcs.rutgers.edu wrote:
> I don't have #5 (sleepycat#14657) nor the unofficial
> http://www.stanford.edu/services/directory/openldap/configuration/patches/d…
> patch. As for the official one, I'm not sure about its relevance to
> the actual SEGV due to the "recovery...fail" comment. In other words,
> though it may be impacting the ability of alock/db_recover to do its
> thing, that's just a side effect of the unclean shutdown which is the
> real bug here to my view.
Patch #5 specifically deals with a race condition where a checkpoint is occurring while a cache buffer retrieval is also occurring, causing database corruption that cannot be recovered from later. At least, that's how I read Sleepycat's description:
5. Fix a bug where cache buffer retrieval could race with a checkpoint call, potentially causing database environment recovery to fail. [#14657]
Given that OpenLDAP checkpoints on shutdown, shutting down the server could be what is triggering the issue for you. I'd suggest applying the patch and seeing if this resolves your problem.
> The region size patch is interesting, but I will tell you that the
> database in question has
>
> set_cachesize 0 200000000 0
>
> and it (to a glance) looks like that only impacts the gig column,
> which I have as zero anyway.
Yeah, the patch may not apply for you (I have a 3.5GB cache, so it does for me). Wouldn't harm anything, of course, if you decided later you needed a larger BDB cache. ;)
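
For readers unfamiliar with the three columns, a DB_CONFIG set_cachesize line
is read as <gbytes> <bytes> <ncache>, and the cache size is gbytes gigabytes
plus bytes. The sketch below shows that arithmetic in C. The function names
are hypothetical, it is not Berkeley DB code, and it assumes a gigabyte means
2^30 bytes here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parses a line of the form "set_cachesize <gbytes> <bytes> <ncache>".
 * Returns 0 on success and -1 if the line does not match.
 */
int parse_set_cachesize(const char *line, uint32_t *gbytes,
                        uint32_t *bytes, int *ncache)
{
    unsigned long g = 0, b = 0;
    int n = 0;

    assert(line && gbytes && bytes && ncache);
    if (sscanf(line, "set_cachesize %lu %lu %d", &g, &b, &n) != 3)
        return -1;
    *gbytes = (uint32_t)g;
    *bytes = (uint32_t)b;
    *ncache = n;
    return 0;
}

/* Total cache size: the "gig column" scaled to bytes, plus the byte column. */
uint64_t cache_total_bytes(uint32_t gbytes, uint32_t bytes)
{
    return (uint64_t)gbytes * 1024u * 1024u * 1024u + bytes;
}
```

For the line quoted above this gives a gig column of 0 and a total of
200000000 bytes. A 3.5 GB cache could be written, for illustration only, as
"set_cachesize 3 536870912 1", which has a nonzero gig column.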
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
I don't have #5 (sleepycat#14657) nor the unofficial
http://www.stanford.edu/services/directory/openldap/configuration/patches/d…
patch. As for the official one, I'm not sure about its relevance to the
actual SEGV due to the "recovery...fail" comment. In other words, though
it may be impacting the ability of alock/db_recover to do its thing,
that's just a side effect of the unclean shutdown which is the real bug
here to my view.
The region size patch is interesting, but I will tell you that the
database in question has
set_cachesize 0 200000000 0
and it (to a glance) looks like that only impacts the gig column, which I
have as zero anyway.
I can tell you that stop/starts weren't an issue with 2.3.32 and the same
Sleepycat binaries...not that I stop/start often as a rule of thumb. (I am
lately; we're implementing ando's {RADIUS} module.) But two identical
traces on two different boxes caught my eye.
On Thu, 26 Apr 2007, Quanah Gibson-Mount wrote:
>
> ----- richton(a)nbcs.rutgers.edu wrote:
>> Full_Name: Aaron Richton
>> Version: 2.3.35
>> OS: Solaris 9
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (128.6.30.206)
>>
>>
>> BDB 4.2.52. I've had a couple (different) machines SEGV on slapd
>> shutdown. Both had identical stack traces:
>
> Weird, I've been running hdb on my servers for nearly a year without such an issue. Did this just start with 2.3.35?
>
> Also, what patches do you have applied to BDB 4.2.52? I'm up to 6 now.
>
> <http://www.stanford.edu/services/directory/openldap/configuration/bdb-build…>
>
> 5 are direct from sleepycat:
>
> <http://www.oracle.com/technology/products/berkeley-db/db/update/4.2.52/patc…>
>
> with the last one there possibly impacting you if you don't have it?
>
> --Quanah
----- richton(a)nbcs.rutgers.edu wrote:
> Full_Name: Aaron Richton
> Version: 2.3.35
> OS: Solaris 9
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (128.6.30.206)
>
>
> BDB 4.2.52. I've had a couple (different) machines SEGV on slapd
> shutdown. Both had identical stack traces:
Weird, I've been running hdb on my servers for nearly a year without such an issue. Did this just start with 2.3.35?
Also, what patches do you have applied to BDB 4.2.52? I'm up to 6 now.
<http://www.stanford.edu/services/directory/openldap/configuration/bdb-build…>
5 are direct from sleepycat:
<http://www.oracle.com/technology/products/berkeley-db/db/update/4.2.52/patc…>
with the last one there possibly impacting you if you don't have it?
--Quanah
Full_Name: Aaron Richton
Version: 2.3.35
OS: Solaris 9
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (128.6.30.206)
BDB 4.2.52. I've had a couple (different) machines SEGV on slapd shutdown. Both
had identical stack traces:
current thread: t@1
=>[1] __dbreg_revoke_id(dbp = 0xa498818, have_lock = 0, force_id = -1), line 427 in "dbreg.c"
  [2] __dbreg_close_files(dbenv = 0x405320), line 206 in "dbreg_util.c"
  [3] __log_dbenv_refresh(dbenv = 0x405320), line 744 in "log.c"
  [4] __dbenv_refresh(dbenv = 0x405320, orig_flags = 0, rep_check = 0), line 648 in "env_open.c"
  [5] __dbenv_close(dbenv = 0x405320, rep_check = 0), line 579 in "env_open.c"
  [6] __dbenv_close_pp(dbenv = 0x405320, flags = 0), line 534 in "env_open.c"
  [7] hdb_db_close(be = 0x367340), line 517 in "init.c"
  [8] backend_shutdown(be = 0x367340), line 351 in "backend.c"
  [9] slap_shutdown(be = (nil)), line 279 in "init.c"
  [10] main(argc = 4, argv = 0xffbffd6c), line 870 in "main.c"
No idea if this is Sleepycat or slapd, but it dirties the database in a way that
neither automatic recovery nor command-line db_recover can cope with:
Ignoring log file: log.0000000007: magic number 0, not 40988
Invalid log file: log.0000000007: Invalid argument
PANIC: Invalid argument
PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
hdb_db_open: Database cannot be recovered, err -30978. Restore from backup!
DB_ENV->lock_id_free interface requires an environment configured for the locking subsystem
txn_checkpoint interface requires an environment configured for the transaction subsystem
bdb_db_close: txn_checkpoint failed: Invalid argument (22)
backend_startup_one: bi_db_open failed! (-30978)
h.b.furuseth(a)usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: RE23
> OS: Linux
> URL:
> Submission from: (NULL) (129.240.202.105)
> Submitted by: hallvard
>
>
> overlay chain breaks with databases ldbm and ldif:
>
> ./run -b {ldbm or ldif} test032-chain outputs:
> ...
> Starting second slapd on TCP/IP port 9012...
> ...
> Comparing "cn=Mark Elliot,ou=Alumni Association,ou=People,dc=example,dc=com"
> on port 9012...
> ldapcompare failed (10)!
>
> ldapcompare output in testrun/test.out:
> Compare Result: Referral (10)
> Matched DN: ou=People,dc=example,dc=com
> Referral: ldap://localhost:9011/ou=People,dc=example,dc=com
> UNDEFINED
>
> testrun/slapd.1.log says the Compare was sent to the chained server, but
> with baseDN=<Matched DN above>:
> ...
> do_compare: dn (ou=People,dc=example,dc=com) attr (cn) value (mark elliot)
> conn=5 op=2 CMP dn="ou=People,dc=example,dc=com" attr="cn"
> ...
> conn=5 op=2 RESULT tag=111 err=16 text=
> ...
>
> Slapcat says the Matched DN entry is correct (same as in a BDB run):
> dn: ou=People,dc=example,dc=com
> objectClass: referral
> objectClass: extensibleObject
> ou: People
> ref: ldap://localhost:9011/ou=People,dc=example,dc=com
> structuralObjectClass: referral
>
>
> Also the Statslog output is wrong for all three databases:
>
> With LDBM and LDIF, the CMP operation is not logged, only its result:
> grep '^conn=4' testrun/slapd.2.log # the ldapcompare operation above
> conn=4 fd=14 ACCEPT from IP=127.0.0.1:45280 (IP=127.0.0.1:9012)
> conn=4 op=0 BIND dn="" method=128
> conn=4 op=0 RESULT tag=97 err=0 text=
> conn=4 op=1 RESULT tag=111 err=10 text=
> conn=4 op=2 UNBIND
> conn=4 fd=14 closed
>
> With BDB, the CMP operation is not logged, but both the returned
> result and the suppressed referral result is logged:
> conn=4 fd=12 ACCEPT from IP=127.0.0.1:42190 (IP=127.0.0.1:9012)
> conn=4 op=0 BIND dn="" method=128
> conn=4 op=0 RESULT tag=97 err=0 text=
> conn=4 op=1 RESULT tag=111 err=6 text=
> conn=4 op=1 RESULT tag=111 err=10 text=
> conn=4 op=2 UNBIND
> conn=4 fd=12 closed
>
>
> I haven't tried other databases, but since two different databases
> had the same bug and same wrong Statslog, I'm guessing others do too.
This seems to be a known limitation, see the FIXME in slapd/compare.c,
near SLAP_COMPARE_IN_FRONTEND. At least, that explains the problem for
back-ldif, which does not provide its own compare entry point. As for
back-ldbm, there is a missing referral_rewrite call in
back-ldbm/referral.c. I don't consider it worthwhile to fix back-ldbm,
but you're welcome to fix it if you wish. Just have a look at
back-bdb/referral.c for the correct code.
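
As an aid to reading the Statslog lines quoted above, the sketch below pulls
the err= value out of a RESULT line and names the LDAP result codes that
appear in this report (numbers as assigned in RFC 4511). The helper names
statslog_err() and result_code_name() are hypothetical and are not part of
slapd or libldap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns the numeric value following " err=" in a Statslog line,
 * or -1 if the field is absent or not a number.
 */
int statslog_err(const char *line)
{
    const char *p;
    char *end;
    long value;

    assert(line != NULL);
    p = strstr(line, " err=");
    if (p == NULL)
        return -1;
    p += 5; /* skip over " err=" */
    value = strtol(p, &end, 10);
    if (end == p)
        return -1;
    return (int)value;
}

/* Names only the result codes seen in the logs quoted in this report. */
const char *result_code_name(int err)
{
    switch (err) {
    case 0:  return "success";
    case 5:  return "compareFalse";
    case 6:  return "compareTrue";
    case 10: return "referral";
    case 16: return "noSuchAttribute";
    default: return "other";
    }
}
```

Read this way, the BDB log shows compareTrue (6) followed by the suppressed
referral (10), while the chained server's err=16 is noSuchAttribute, which is
consistent with the Compare having been sent against the referral entry's DN.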
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/