On Thu, 20 Sep 2012, Emmanuel Dreyfus wrote:
When feeding an LDAP URI to ldap_url_parse(), I understand some characters may need to be escaped in filters in order to get a literal:
* => \2a
( => \28
) => \29
\ => \5c
/ => \2f
Reading the man page, I understand %-encoding is not mandatory, but it is of course required for ?, and obviously for %:
? -> %3F
% -> %25
Are there other characters that should be %-encoded?
From RFC 4516, LDAP: Uniform Resource Locator, section 2.1:
An octet MUST be encoded using the percent-encoding mechanism described in section 2.1 of [RFC3986] in any of these situations:
The octet is not in the reserved set defined in section 2.2 of [RFC3986] or in the unreserved set defined in section 2.3 of [RFC3986].
It is the single Reserved character '?' and occurs inside a <dn>, <filter>, or other element of an LDAP URL. ...
From RFC 3986, URI Generic Syntax, section 2.2 and section 2.3:
reserved   = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
So, you have to percent-encode all non-graphical characters (0x00 through 0x20 and 0x7f through 0xff), as well as:
" -> %22
% -> %25
< -> %3c
> -> %3e
? -> %3f
\ -> %5c
^ -> %5e
` -> %60
{ -> %7b
| -> %7c
} -> %7d
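
For illustration, here is a minimal Python sketch of that rule. The helper name ldap_url_quote is hypothetical (it is not part of any LDAP library), it assumes the component is encoded as UTF-8 before escaping, and it covers only the two situations quoted above, not whatever the quoted RFC text elides. It rebuilds the must-encode set from the RFC 3986 grammar rather than hard-coding it, so the list above can be checked against it.

```python
import string

# Character classes copied from RFC 3986, sections 2.2 and 2.3.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
UNRESERVED = string.ascii_letters + string.digits + "-._~"

# Octets that may stay literal: reserved or unreserved, minus '?',
# which RFC 4516 says must be encoded inside a <dn>, <filter>, etc.
SAFE = frozenset(
    (GEN_DELIMS + SUB_DELIMS + UNRESERVED).replace("?", "").encode("ascii")
)


def ldap_url_quote(component: str) -> str:
    """Percent-encode one LDAP URL component (hypothetical helper).

    Assumption: the component is serialized as UTF-8 first, so every
    octet outside SAFE, including 0x80 through 0xff, becomes %xx.
    """
    out = []
    for octet in component.encode("utf-8"):
        if octet in SAFE:
            out.append(chr(octet))
        else:
            out.append("%{:02x}".format(octet))
    return "".join(out)


# Graphical ASCII characters (0x21 through 0x7e) that are not safe.
must_encode = "".join(chr(c) for c in range(0x21, 0x7F) if c not in SAFE)
print(must_encode)                       # "%<>?\^`{|}

# A filter whose value already carries the filter-level escape \2a:
print(ldap_url_quote(r"(cn=a\2ab?)"))    # (cn=a%5c2ab%3f)
```

Note that the two escaping layers stack: the filter-level escape \2a is applied first, and its backslash is then percent-encoded, giving %5c2a in the URL.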
Philip Guenther