Hallvard B Furuseth wrote:
Howard Chu writes:
[Pulling last line up front]
userPassword is a string of *octets*, not *characters*...
This is backwards.
No.
That simply means anything can be stored there
Yes. I could use a 16 byte binary UUID; the server has no responsibility or preference here.
- so password charset policy, if any, is up to whoever stores
userPassword values. As in fact RFC 4519 2.41 paragraph 2 says:
   2.41. 'userPassword'
   (...)
      The application SHOULD prepare textual strings used as passwords
      by transcoding them to Unicode, applying SASLprep [RFC4013], and
      encoding as UTF-8. The determination of whether a password is
      textual is a local client matter.
And again - client matter, not server.
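To make the "client matter" part concrete: the preparation the RFC describes happens entirely before the Bind request goes out on the wire. A rough Python sketch (with NFKC normalization standing in for full SASLprep, which the standard library doesn't provide, and the source charset being whatever the client actually knows about its own input):

    import unicodedata

    def prepare_password(raw, source_charset='iso-8859-1'):
        # Transcode the local input to Unicode...
        if isinstance(raw, bytes):
            raw = raw.decode(source_charset)
        # ...normalize (NFKC here only approximates the SASLprep step)...
        text = unicodedata.normalize('NFKC', raw)
        # ...and encode as UTF-8 before putting it in the Bind request.
        return text.encode('utf-8')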
userPassword is an octetString, therefore if the server does any type of character set conversion on its values it is Broken.
Which means it's also Broken if it hashes a Simple Bind password before comparing it with userPassword.
Yes. And the fact that RFC2307 made this common practice doesn't change the fact that it's broken.
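For reference, the common practice in question - the RFC 2307-style {SSHA} check a server performs on a Simple Bind - looks roughly like this (a Python sketch, not OpenLDAP's actual code; note it operates purely on octets, so no character set enters at this layer):

    import base64, hashlib, os

    def ssha_store(password_octets):
        # Produce an RFC 2307-style '{SSHA}' userPassword value.
        salt = os.urandom(4)
        digest = hashlib.sha1(password_octets + salt).digest()
        return b'{SSHA}' + base64.b64encode(digest + salt)

    def ssha_check(password_octets, stored):
        # Hash the incoming Bind password octets with the stored salt
        # and compare against the stored digest.
        raw = base64.b64decode(stored[len(b'{SSHA}'):])
        digest, salt = raw[:20], raw[20:]   # SHA-1 digests are 20 bytes
        return hashlib.sha1(password_octets + salt).digest() == digest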
Since OpenLDAP does anyway though, or if one used another attribute where that wasn't a bug, there's no formal reason why the hash function couldn't involve charset conversion.
Again, making things more broken is not a good idea. If we're touching code, it should be to make it *less* broken.
Certainly it'd be an ugly hack, hopefully going away someday. And I'm not arguing to do it if it's avoidable. But ugly hacks are normal enough when one has to deal with a pre-existing mess, which certainly describes charset issues on many sites.
Accommodating pre-existing messes just allows them to perpetuate forever. Adding systems to the mess only increases the mess. At some point you have to say "No more." Otherwise your job as an admin quickly grows beyond any possibility of control.
Also several SASL mechanisms do hash passwords, but this time according to the standard. And if I remember correctly, several specify UTF-8 passwords.
What SASL does is not relevant here; what SASL hands to LDAP is opaque data.
IMO it is not the LDAP subsystem's job to worry about how that octetString was generated.
No, it's the LDAP admin's and/or the user's job. If it's the LDAP admin, he faces LDAP clients which send Bind passwords, a source of stored passwords, terminals and users with various character sets - and it's his job to get them to agree on what a password looks like.
E.g., if you're using a device whose input mechanism (keyboard, touchscreen, whatever) can only generate 7 bit ASCII characters, that's not our concern.
If you mean "our" as in OpenLDAP project, indeed not.
If your password was originally selected using 8 bit data, and your current keyboard cannot generate the relevant octets, you're SOL.
But if your password was originally stored using Latin-1 but you're now using a client which sends UTF-8 or vice versa, it may be possible for LDAP to help. Either by hacking the Bind password or by storing two password hashes in userPassword, one for each supported character set.
That is certainly your option, since userPassword is multivalued. But the client would have to use a regular ldapmodify op to achieve this, since the original character string only exists consistently on the client.
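A minimal sketch of what such a modify could look like - two {SSHA} values computed over the Latin-1 and UTF-8 octets of the same passphrase, emitted as LDIF for an ordinary ldapmodify (the DN and passphrase here are made up):

    import base64, hashlib, os

    def ssha(octets):
        salt = os.urandom(4)
        digest = hashlib.sha1(octets + salt).digest()
        return '{SSHA}' + base64.b64encode(digest + salt).decode('ascii')

    passphrase = 'blåbær'   # one character string, two encodings below
    values = [ssha(passphrase.encode(cs)) for cs in ('latin-1', 'utf-8')]

    print('dn: uid=someuser,dc=example,dc=com')
    print('changetype: modify')
    print('replace: userPassword')
    for v in values:
        print('userPassword: ' + v)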
If you really want to fix this, it will take more than the current LDAPv3 spec work has accomplished so far. E.g., you need a new attributeType ("userCredential" perhaps) that you define to be a directory string, not an octet string, so that it's understood that character-set semantics are significant. (And, since character strings in LDAP must be UTF-8, you eliminate that ambiguity permanently. You also eliminate the possibility of using arbitrary octet sequences though, which may leave yet another group of users out in the cold.) You probably use tags to differentiate hash types, so that you don't require clients or servers to muck with the actual password attribute values. (E.g., userCredential;crypt: xyzzy)
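To illustrate how the tagging idea could keep both sides out of the value itself, here is a purely hypothetical sketch - the userCredential attribute, the tag name (sha1 rather than crypt, just to keep the sketch self-contained), and the verify logic are all made up for illustration, not existing schema or API:

    import base64, hashlib

    def verify(entry_attrs, candidate_utf8):
        # entry_attrs maps attribute descriptions (possibly tagged,
        # e.g. 'userCredential;sha1') to lists of stored values.
        for desc, values in entry_attrs.items():
            attr, _, tag = desc.partition(';')
            if attr != 'userCredential':
                continue
            if tag == 'sha1':                 # the tag names the hash scheme
                want = base64.b64encode(hashlib.sha1(candidate_utf8).digest())
                if want.decode('ascii') in values:
                    return True
        return False

    stored = base64.b64encode(
        hashlib.sha1('secret'.encode('utf-8')).digest()).decode('ascii')
    entry = {'userCredential;sha1': [stored]}
    print(verify(entry, 'secret'.encode('utf-8')))    # True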