Howard Chu writes:
[Pulling last line up front]
userPassword is a string of *octets* not *characters*...
This is backwards. That simply means anything can be stored there - so password charset policy, if any, is up to whoever stores userPassword values. As in fact RFC 4519 section 2.41, paragraph 2 says:
2.41. 'userPassword' (...) The application SHOULD prepare textual strings used as passwords by transcoding them to Unicode, applying SASLprep [RFC4013], and encoding as UTF-8. The determination of whether a password is textual is a local client matter.
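For what it's worth, here is a rough Python sketch of what that client-side preparation could look like. It is not a complete SASLprep: it only does the RFC 4013 mapping step (drop the "mapped to nothing" characters, turn non-ASCII spaces into plain spaces) and NFKC normalization, and skips the prohibited-output, bidi and unassigned-code-point checks. The function name prepare_password is my own invention.

```python
import stringprep
import unicodedata


def prepare_password(text: str) -> bytes:
    """Simplified password preparation: map, normalize (NFKC), encode as UTF-8.

    Assumption: the RFC 4013 prohibited-output and bidi checks are
    deliberately left out, so this is not a full SASLprep.
    """
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            # Table B.1: characters "commonly mapped to nothing" (e.g. soft hyphen)
            continue
        # Table C.1.2: non-ASCII space characters are mapped to U+0020
        mapped.append(" " if stringprep.in_table_c12(ch) else ch)
    normalized = unicodedata.normalize("NFKC", "".join(mapped))
    return normalized.encode("utf-8")
```

The result is the octet string the client would send; the server still sees nothing but octets.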
userPassword is an octetString, therefore if the server does any type of character set conversion on its values it is Broken.
Which means it's also Broken if it hashes a Simple Bind password before comparing it with userPassword. But since OpenLDAP does that anyway, or if one used another attribute where hashing wasn't wrong, there's no formal reason why the hash function couldn't involve charset conversion.
Certainly it'd be an ugly hack, hopefully going away someday. And I'm not arguing to do it if it's avoidable. But ugly hacks are normal enough when one has to deal with a pre-existing mess, which certainly describes charset issues on many sites.
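To make that hack concrete, here is a small Python sketch of the charset step such a comparison could use. It is purely illustrative and not how OpenLDAP does anything; bind_candidates is a made-up name, and the only charsets assumed are UTF-8 and Latin-1. Given the Bind password octets, it returns the octet strings a comparison might try against the stored hash.

```python
def bind_candidates(bind_password: bytes) -> list:
    """Return the Bind password plus its other-charset form, if there is one.

    Assumption: the site only has UTF-8 and Latin-1 passwords to deal with.
    """
    candidates = [bind_password]
    try:
        text = bind_password.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: treat the octets as Latin-1 and offer the UTF-8 form.
        alt = bind_password.decode("latin-1").encode("utf-8")
    else:
        try:
            # Valid UTF-8: offer the Latin-1 form if every character fits.
            alt = text.encode("latin-1")
        except UnicodeEncodeError:
            alt = None
    if alt is not None and alt != bind_password:
        candidates.append(alt)
    return candidates
```

A pure-ASCII password yields a single candidate, since both encodings produce the same octets.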
Also several SASL mechanisms do hash passwords, but this time according to the standard. And if I remember correctly, several specify UTF-8 passwords.
Clients should not do any hashing or encoding; they should use the PasswordModify exop and send a plaintext octetString.
That's perfect from the LDAP point of view, if the LDAP admin is in charge of the world - or at least of the site. But I was talking about the case where the userPassword values in LDAP do not come from updates by the users, but from another password store such as /etc/passwd, NIS or whatever.
And where the LDAP admin does not control how passwords are prepared before they are hashed there, so that LDAP must accommodate the quirks of that other password store.
IMO it is not the LDAP subsystem's job to worry about how that octetString was generated.
No, it's the LDAP admin's and/or the user's job. If it's the LDAP admin, he faces LDAP clients which send Bind passwords, a source of stored passwords, terminals and users with various character sets - and it's his job to get them to agree on what a password looks like.
E.g., if you're using a device whose input mechanism (keyboard, touchscreen, whatever) can only generate 7 bit ASCII characters, that's not our concern.
If you mean "our" as in OpenLDAP project, indeed not.
If your password was originally selected using 8 bit data, and your current keyboard cannot generate the relevant octets, you're SOL.
But if your password was originally stored using Latin-1 but you're now using a client which sends UTF-8 or vice versa, it may be possible for LDAP to help. Either by hacking the Bind password or by storing two password hashes in userPassword, one for each supported character set.
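A sketch of the second option in Python, assuming the usual {SSHA} layout (base64 of the SHA-1 digest of password plus salt, followed by the salt) and assuming whoever provisions the entry still has the password as text. The helper names are made up for illustration. One userPassword value is generated per supported charset, and a Bind succeeds if its octets match any of them.

```python
import base64
import hashlib
import hmac
import os
from typing import List, Optional


def make_ssha(password: bytes, salt: Optional[bytes] = None) -> str:
    """Build an {SSHA} value: base64(SHA1(password + salt) + salt)."""
    if salt is None:
        salt = os.urandom(8)
    digest = hashlib.sha1(password + salt).digest()
    return "{SSHA}" + base64.b64encode(digest + salt).decode("ascii")


def check_ssha(password: bytes, stored: str) -> bool:
    """Compare password octets against one {SSHA} value."""
    raw = base64.b64decode(stored[len("{SSHA}"):])
    digest, salt = raw[:20], raw[20:]  # a SHA-1 digest is 20 octets
    return hmac.compare_digest(hashlib.sha1(password + salt).digest(), digest)


def values_for_charsets(password: str, charsets=("latin-1", "utf-8")) -> List[str]:
    """One userPassword value per supported charset (hypothetical helper)."""
    return [make_ssha(password.encode(cs)) for cs in charsets]


def bind_ok(bind_password: bytes, stored_values: List[str]) -> bool:
    """A Bind matches if its octets match any stored value."""
    return any(check_ssha(bind_password, v) for v in stored_values)
```

A password with characters outside Latin-1 would raise UnicodeEncodeError in values_for_charsets; a real tool would have to skip that charset instead.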