On Thu, 20 Sep 2012, Emmanuel Dreyfus wrote:
When feeding an LDAP URI to ldap_url_parse(), I understand some characters
may need to be escaped in filters in order to get a literal:
* => \2a
( => \28
) => \29
\ => \5c
/ => \2f
Reading the man page, I understand %-encoding is not mandatory, but
it is of course required for ?, and obviously for %.
? -> %3F
% -> %25
Are there other characters that should be %-encoded?
From RFC 4516, LDAP: Uniform Resource Locator, section 2.1:
An octet MUST be encoded using the percent-encoding mechanism
described in section 2.1 of [RFC3986] in any of these situations:
The octet is not in the reserved set defined in section 2.2 of
[RFC3986] or in the unreserved set defined in section 2.3 of
[RFC3986].
It is the single Reserved character '?' and occurs inside a <dn>,
<filter>, or other element of an LDAP URL.
...
From RFC 3986, URI Generic Syntax, section 2.2 and section 2.3:
reserved    = gen-delims / sub-delims
gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
So, you have to percent-encode all non-graphical characters (0x00 through
0x20 and 0x7f through 0xff), as well as:
" -> %22
% -> %25
< -> %3c
> -> %3e
? -> %3f
\ -> %5c
^ -> %5e
` -> %60
{ -> %7b
| -> %7c
} -> %7d
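
As a rough illustration, the rules above could be sketched in C along the
following lines. The names needs_pct() and ldap_url_pct_encode() are
hypothetical helpers, not part of libldap, and the sketch covers only the
cases listed above (it does not look at any rules elided by the "..." in
the RFC 4516 quote).

```c
#include <stddef.h>

/* Return nonzero if octet c must be percent-encoded: anything outside the
 * RFC 3986 reserved and unreserved sets, plus the reserved character '?'. */
static int needs_pct(unsigned char c)
{
    if (c <= 0x20 || c >= 0x7f)     /* non-graphical octets */
        return 1;
    switch (c) {
    case '"': case '%': case '<': case '>': case '?': case '\\':
    case '^': case '`': case '{': case '|': case '}':
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical helper (not a libldap function): percent-encode the
 * NUL-terminated string src into dst, which holds dstlen bytes.
 * Returns the encoded length, or -1 if dst is too small. */
static int ldap_url_pct_encode(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789abcdef";
    size_t o = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (needs_pct(c)) {
            if (o + 3 >= dstlen)    /* three octets plus the final NUL */
                return -1;
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0f];
        } else {
            if (o + 1 >= dstlen)    /* one octet plus the final NUL */
                return -1;
            dst[o++] = (char)c;
        }
    }
    if (o >= dstlen)
        return -1;
    dst[o] = '\0';
    return (int)o;
}
```

With this sketch, a filter that has already been escaped at the filter
level, such as (cn=a\2ab), would come out as (cn=a%5c2ab): the parentheses
and "=" are in the reserved set and pass through, while the backslash is
percent-encoded.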
Philip Guenther