I'm not runing a 2.3 server for comparison, but is this a bug or is it expected behavior to have this sort of error?
In a program, which had constructed the ** due to a missing first name, it also was found to break the connection to the server...
CCC5:~> ldapsearch '(cn=** VERNON-GERSTENFELD*)' SASL/GSSAPI authentication started SASL username: aej@WPI.EDU SASL SSF: 56 SASL data security layer installed. # extended LDIF # # LDAPv3 # base <dc=WPI, dc=EDU> (default) with scope subtree # filter: (cn=** VERNON-GERSTENFELD*) # requesting: ALL #
# extended result response extended: 1.3.6.1.4.1.1466.20036 result: 2 Protocol error text: unexpected data in PDU
# numResponses: 1 # numExtended: 1 CCC5:~> ldapsearch -VV ldapsearch: @(#) $OpenLDAP: ldapsearch 2.4.7 (Dec 17 2007 09:23:50) $ aej@CCC1.WPI.EDU:/tools/src/openldap/RHEL4-i686/openldap-2.4.7/clients/tools (LDAP library: OpenLDAP 20407)
"Allan E. Johannesen" aej@WPI.EDU writes:
I'm not runing a 2.3 server for comparison, but is this a bug or is it expected behavior to have this sort of error?
In a program, which had constructed the ** due to a missing first name, it also was found to break the connection to the server...
CCC5:~> ldapsearch '(cn=** VERNON-GERSTENFELD*)'
[...] There should be only one asterisk and no space after a asterisk. See rfc 4515, String representation on search filters.
-Dieter
On Thu, 27 Dec 2007, Dieter Kluenter wrote:
"Allan E. Johannesen" aej@WPI.EDU writes:
...
In a program, which had constructed the ** due to a missing first name, it also was found to break the connection to the server...
CCC5:~> ldapsearch '(cn=** VERNON-GERSTENFELD*)'
[...] There should be only one asterisk and no space after a asterisk. See rfc 4515, String representation on search filters.
Umm, no. That filter matches the grammar in RFC 4515. The relevant grammar terms are:
filter = LPAREN filtercomp RPAREN filtercomp = and / or / not / item item = simple / present / substring / extensible substring = attr EQUALS [initial] any [final] any = ASTERISK *(assertionvalue ASTERISK) assertionvalue = valueencoding valueencoding = 0*(normal / escaped) normal = UTF1SUBSET / UTFMB UTF1SUBSET = %x01-27 / %x2B-5B / %x5D-7F
I.e., it's a substring match with no 'initial', no 'final', and two 'assertionvalue' parts, the first of which is empty (a 'valueencoding' can be zero length) and the second of which is " VERNON-GERSTENFELD".
So, if the text form of the filter is legal, what's breaking? Either the client is sending bad BER or the server is refusing good BER. It looks to me like the latter: the protocol's ASN.1 says that an 'any' substring is an OCTET STRING, which can be zero length, but the substring parser in slapd doesn't accept that: in get_ssa() it treats a zero length substring as an error: if ( value.bv_val == NULL || value.bv_len == 0 ) { rc = LDAP_INVALID_SYNTAX; goto return_error; }
I suggest that Allan file an ITS...
Philip Guenther
On Dec 27, 2007, at 10:26 AM, Philip Guenther wrote:
On Thu, 27 Dec 2007, Dieter Kluenter wrote:
"Allan E. Johannesen" aej@WPI.EDU writes:
...
In a program, which had constructed the ** due to a missing first name, it also was found to break the connection to the server...
CCC5:~> ldapsearch '(cn=** VERNON-GERSTENFELD*)'
[...] There should be only one asterisk and no space after a asterisk. See rfc 4515, String representation on search filters.
Umm, no. That filter matches the grammar in RFC 4515.
Note that there is no way one can represent an empty initial or empty final substrings. That's because empty substrings are nonsense. The filter (cn=**) is invalid as it represents an empty initial, an empty any, and an empty final. slapd(8) correctly errors on receipt of any empty substring.
One can argue that the LDAP ASN.1 and the LDAP filter ABNF should have size constraints on substring values. I consider it a minor bug in specification which should be fixed.
IMO, the bug is in ldapsearch(1). It should reject a filter with empty substrings.
-- Kurt
The relevant grammar terms are:
filter = LPAREN filtercomp RPAREN filtercomp = and / or / not / item item = simple / present / substring / extensible substring = attr EQUALS [initial] any [final] any = ASTERISK *(assertionvalue ASTERISK) assertionvalue = valueencoding valueencoding = 0*(normal / escaped) normal = UTF1SUBSET / UTFMB UTF1SUBSET = %x01-27 / %x2B-5B / %x5D-7F
I.e., it's a substring match with no 'initial', no 'final', and two 'assertionvalue' parts, the first of which is empty (a 'valueencoding' can be zero length) and the second of which is " VERNON-GERSTENFELD".
So, if the text form of the filter is legal, what's breaking? Either the client is sending bad BER or the server is refusing good BER. It looks to me like the latter: the protocol's ASN.1 says that an 'any' substring is an OCTET STRING, which can be zero length, but the substring parser in slapd doesn't accept that: in get_ssa() it treats a zero length substring as an error: if ( value.bv_val == NULL || value.bv_len == 0 ) { rc = LDAP_INVALID_SYNTAX; goto return_error; }
I suggest that Allan file an ITS...
Philip Guenther
Kurt Zeilenga wrote:
On Dec 27, 2007, at 10:26 AM, Philip Guenther wrote:
Umm, no. That filter matches the grammar in RFC 4515.
Note that there is no way one can represent an empty initial or empty final substrings. That's because empty substrings are nonsense. The filter (cn=**) is invalid as it represents an empty initial, an empty any, and an empty final. slapd(8) correctly errors on receipt of any empty substring.
One can argue that the LDAP ASN.1 and the LDAP filter ABNF should have size constraints on substring values. I consider it a minor bug in specification which should be fixed.
Size constraints would also fix this stupid wart in the grammar:
Note that although both the <substring> and <present> productions in the grammar above can produce the "attr=*" construct, this construct is used only to denote a presence filter.
IMO, the bug is in ldapsearch(1). It should reject a filter with empty substrings.
"kurt" == Kurt Zeilenga kurt@openldap.org writes:
kurt> IMO, the bug is in ldapsearch(1).
The bug was discovered in a program, not ldapsearch, and wouldn't have been noticed other than it reliably terminating the connection to the server, rather than letting the program continue.
The ldap_search_s() returned LDAP_SUCCESS and the ldap_first_entry was where the assert discovered the closed connection.
So, if this is only a bug in ldapsearch, all the programs that call ldap_search will need to not only check for LDAP_SUCCESS, but also for the LDAP * structure being closed.
On Thu, 27 Dec 2007, Howard Chu wrote:
Kurt Zeilenga wrote:
On Dec 27, 2007, at 10:26 AM, Philip Guenther wrote:
Umm, no. That filter matches the grammar in RFC 4515.
Note that there is no way one can represent an empty initial or empty final substrings. That's because empty substrings are nonsense. The filter (cn=**) is invalid as it represents an empty initial, an empty any, and an empty final. slapd(8) correctly errors on receipt of any empty substring.
If this has always been an error, then I'm surprised the restriction isn't mentioned at all---not even in prose---in either RFC 2251 or RFC 4511. Note that RFC 4517 _does_ ban empty 'any' parts in SubstringAssertion, so the issue was at least thought about in that context.
I would also contest the claim that empty substrings are nonsense. They may clearly be useless, but they are not logically inconsistent. Indeed, a server can trivially 'support' them by simply ignoring them whenever they appear. The lack of a restriction against them in the specs would imply that a server that fails to do so is non-compliant.
That said, if most servers throw an error on empty substrings, then clients can't send them and expect it to reliably work and (as you suggest) they should be removed from the protocol when it is revised. Perhaps an errata should be filed now.
Size constraints would also fix this stupid wart in the grammar:
Note that although both the <substring> and <present> productions in the grammar above can produce the "attr=*" construct, this construct is used only to denote a presence filter.
Doing _that_ would require more than size constraints on 'any'. To keep 'substring' from matching "attr=*" you need to make substring something like: substring = attr EQUALS substringpat substringpat = initial ASTERISK [any] [final] / ASTERISK any [final] / ASTERISK final initial = substringvalue any = 1*(substringvalue ASTERISK) final = substringvalue substringvalue = 1*(normal / escaped)
(I.e., break out the three cases of non-empty initial, any, and final.)
IMO, the bug is in ldapsearch(1). It should reject a filter with empty substrings.
You mean it's a bug in libldap, as that's what's doing the parsing.
I guess Allan needs to go ahead and change the program that's generating these searches to explicitly test for empty parts and suppress them when building the filter. (He should probably also check that it's doing RFC 2254 encoding of the values that it's inserting in the filters while he's at it. It's no good to catch an empty first-name field if the user can ask for a first-name of "*" and generate "(cn=***lastname*)". Generating filters isn't as trivial as one might wish.)
Philip Guenther
Kurt Zeilenga wrote:
On Dec 27, 2007, at 10:26 AM, Philip Guenther wrote:
On Thu, 27 Dec 2007, Dieter Kluenter wrote:
There should be only one asterisk and no space after a asterisk. See rfc 4515, String representation on search filters.
Umm, no. That filter matches the grammar in RFC 4515.
Note that there is no way one can represent an empty initial or empty final substrings. That's because empty substrings are nonsense. The filter (cn=**) is invalid as it represents an empty initial, an empty any, and an empty final. slapd(8) correctly errors on receipt of any empty substring.
One can argue that the LDAP ASN.1 and the LDAP filter ABNF should have size constraints on substring values. I consider it a minor bug in specification which should be fixed.
IMO, the bug is in ldapsearch(1). It should reject a filter with empty substrings.
Hmm. RCS file: /repo/OpenLDAP/pkg/ldap/libraries/libldap/filter.c,v ---------------------------- revision 1.5 date: 2002/01/02 22:03:37; author: kurt; state: Exp; lines: +2 -2 fix empty substrings any bug ---------------------------- diff -u -r1.4 -r1.5 --- filter.c 2 Jan 2002 19:31:21 -0000 1.4 +++ filter.c 2 Jan 2002 22:03:37 -0000 1.5 @@ -675,7 +675,7 @@
*nextstar++ = '\0';
- if ( *val != '\0' ) { + if ( *val != '\0' || ftype == LDAP_SUBSTRING_ANY ) { ber_slen_t len = ldap_pvt_filter_value_unescape( val );
if ( len < 0 ) {
So at some point, this was intended to be legal.
On Dec 27, 2007, at 9:16 PM, Howard Chu wrote:
So at some point, this was intended to be legal.
Or I just made the change to allow what RFC 4515 allows, without regard to whether allowing it made any sense.
I think it makes no sense for (foo=**) to be valid when (foo:substringsRule:=\2A\2A), substringsRule being foo's substrings rule, is invalid. It also makes no sense to allow empty any substrings but not allow empty initial and final substrings. A substring is a substring is a substring.
I also note that CN is of syntax directoryString, so (cn=**), if a valid assertion, might not evaluate to true for all (or any) values of CN, as one might expect it would.
-- Kurt
"ldapsoft" == Philip Guenther guenther+ldapsoft@sendmail.com writes:
ldapsoft> I guess Allan needs to go ahead and change the program that's ldapsoft> generating these searches to explicitly test for empty parts and ldapsoft> suppress them when building the filter. (He should probably also ldapsoft> check that it's doing RFC 2254 encoding of the values that it's ldapsoft> inserting in the filters while he's at it. It's no good to catch an ldapsoft> empty first-name field if the user can ask for a first-name of "*" ldapsoft> and generate "(cn=***lastname*)". Generating filters isn't as ldapsoft> trivial as one might wish.)
Yes, I put in a check for empty searches, but you've pointed out that the data might contain other characters which might break the search. The problem arose when processing
last-name, first-name
where it was written:
last-name,
Had the person left off the comma, none of this brouhaha would have come to light. The simple program assumed that if there was a comma, something must be after it. If the person wrote:
last-name, *
then it's bound for failure again despite my recent patch. Clearly, I need to make it more robust. It had been running for 5+ years without this trouble appearing.
Actually, I don't care much about a search failing, but the search failed so profoundly that it broke the connection to the server. Not having recovery of the connection was no problem for these years, either, but now all these deficiencies have come to light.
I figured if it didn't like the filter, it would just say so, rather than my having to replicate its entire parsing mechanism in any client program to avoid upsetting it.
openldap-software@openldap.org