Hi all,
Looking back in time to the definitions of ldap_get_values() and ldap_get_values_len(), we are told that "If the attribute values are binary in nature, and thus not suitable to be returned as an array of char *'s, the ldap_get_values_len() routine can be used instead."
This has been historically vague - first off, what happens if an attempt is made to call ldap_get_values() on binary data, do you get an error, or garbage data? The source isn't giving me a clear answer.
Second question is how do you know which of ldap_get_values() or ldap_get_values_len() to call? Obviously you can manually know this, but I'm interested in automated behaviour. What is the canonical way to discover that if you queried a jpegPhoto (for example) that the result would be binary?
To be clear, the end goal is to patch the following documentation to make this clear:
https://www.openldap.org/software/man.cgi?query=ldap_get_values&apropos=...
Regards, Graham --
On Wed, Apr 03, 2024 at 10:55:26AM +0100, Graham Leggett wrote:
Hi all,
Looking back in time to the definitions of ldap_get_values() and ldap_get_values_len(), we are told that "If the attribute values are binary in nature, and thus not suitable to be returned as an array of char *'s, the ldap_get_values_len() routine can be used instead."
This has been historically vague - first off, what happens if an attempt is made to call ldap_get_values() on binary data, do you get an error, or garbage data? The source isn't giving me a clear answer.
Hi Graham, in this case binary data means embedded NULs (\0) can be found: given that what you get back is a naive char * for each value, you stand to lose information about whether that NUL is part of the value or a string terminator.
Second question is how do you know which of ldap_get_values() or ldap_get_values_len() to call? Obviously you can manually know this, but I'm interested in automated behaviour. What is the canonical way to discover that if you queried a jpegPhoto (for example) that the result would be binary?
Either you expect the data to be a string of some sort (no embedded NULs), in which case you're free to use whichever, or you are prepared to accept arbitrary bytestreams, in which case you need to use the one that returns bervals. That's all there is.
You're welcome to propose better wording if you can make it clearer to a reasonably competent C developer (I'm sure we can assume that they understand how strings are laid out etc.)
Regards,
On 03 Apr 2024, at 13:03, Ondřej Kuzník ondra@mistotebe.net wrote:
This has been historically vague - first off, what happens if an attempt is made to call ldap_get_values() on binary data, do you get an error, or garbage data? The source isn't giving me a clear answer.
Hi Graham, in this case binary data means embedded NULs (\0) can be found: given that what you get back is a naive char * for each value, you stand to lose information about whether that NUL is part of the value or a string terminator.
So that means garbage data is returned.
Second question is how do you know which of ldap_get_values() or ldap_get_values_len() to call? Obviously you can manually know this, but I'm interested in automated behaviour. What is the canonical way to discover that if you queried a jpegPhoto (for example) that the result would be binary?
Either you expect the data to be a string of some sort (no embedded NULs), in which case you're free to use whichever, or you are prepared to accept arbitrary bytestreams, in which case you need to use the one that returns bervals. That's all there is.
So am I right in understanding there is no way to ask the server "what type is this attribute you just gave me, is this arbitrary octets or a NUL terminated string"?
You're welcome to propose better wording if you can make it clearer to a reasonably competent C developer (I'm sure we can assume that they understand how strings are laid out etc.)
The reason this matters has nothing to do with reasonably competent C developers, but rather options given to end users.
If the end user is allowed to provide an attribute in a configuration file, do I force the end user to know about binary values (as is common now), or is there a way I can be nice to the end user and have the system behave sensibly based on whether the return data is a string or binary?
Regards, Graham --
On Wed, Apr 03, 2024 at 02:08:15PM +0100, Graham Leggett wrote:
On 03 Apr 2024, at 13:03, Ondřej Kuzník ondra@mistotebe.net wrote:
This has been historically vague - first off, what happens if an attempt is made to call ldap_get_values() on binary data, do you get an error, or garbage data? The source isn't giving me a clear answer.
Hi Graham, in this case binary data means embedded NULs (\0) can be found: given that what you get back is a naive char * for each value, you stand to lose information about whether that NUL is part of the value or a string terminator.
So that means garbage data is returned.
Not completely, you just ignore the rest of the value if you use anything that's strlen()-based.
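To make the truncation concrete, here is a minimal sketch of how much of a value strlen()-based code would never see. The struct demo_berval type and the bytes_hidden_from_strlen() helper are hypothetical names for illustration; the struct only mirrors the bv_len and bv_val fields of the real struct berval from <lber.h> so that the example compiles without libldap.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for struct berval from <lber.h> (same field names).
 * Hypothetical: declared locally so the sketch builds without libldap. */
struct demo_berval {
    size_t bv_len;   /* number of bytes in the value */
    char  *bv_val;   /* the bytes; may contain embedded NULs */
};

/* Return how many bytes of the value lie at or after the first embedded
 * NUL, i.e. the part that strlen()-based code would silently ignore. */
size_t bytes_hidden_from_strlen(const struct demo_berval *bv)
{
    const char *nul;
    size_t visible;

    if (bv->bv_len == 0)
        return 0;

    /* memchr is bounded by bv_len, so it never reads past the value. */
    nul = memchr(bv->bv_val, '\0', bv->bv_len);
    visible = nul ? (size_t)(nul - bv->bv_val) : bv->bv_len;
    return bv->bv_len - visible;
}
```

For a value that starts with a typical JPEG header (FF D8 FF E0 00 10 ...), the first NUL is the fifth byte, so a char * consumer would see only the first four bytes and nothing else of the image.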
So am I right in understanding there is no way to ask the server "what type is this attribute you just gave me, is this arbitrary octets or a NUL terminated string"?
There's server schema, otherwise no.
Are you processing the values or just treating them as opaque data? If the latter, why do you care? If you're processing it, you should know whether you expect a (UTF-8, IA5, ...) string or something else.
Regardless, the ldap_get_values API is legacy, from a time when things were expected to be just strings, and I guess it makes some tasks easier for lazy(?) programmers. ldap_get_values_len gives you explicit information about the length of the data, enabling safer processing. It certainly doesn't stop you from handling strings: the .bv_val pointers are exactly what you'd obtain from ldap_get_values.
You're welcome to propose better wording if you can make it clearer to a reasonably competent C developer (I'm sure we can assume that they understand how strings are laid out etc.)
The reason this matters has nothing to do with reasonably competent C developers, but rather options given to end users.
If the end user is allowed to provide an attribute in a configuration file, do I force the end user to know about binary values (as is common now), or is there a way I can be nice to the end user and have the system behave sensibly based on whether the return data is a string or binary?
It is up to the developer how they intend to handle the returned data, see above. The user has no influence over this whatsoever. If the application lets the user specify an arbitrary attribute, then it has to be written accordingly (even if it's only to check that strlen(.bv_val) == .bv_len); anything else is asking for trouble.
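One way such a check might look, as a sketch only: the function below walks a NULL-terminated array of value pointers, which is the shape ldap_get_values_len() returns, and reports whether every value is free of embedded NULs. The names struct demo_berval and all_values_are_c_strings() are hypothetical, and the struct is a local stand-in for struct berval so the example builds without libldap. It uses memchr() bounded by bv_len rather than strlen(); that is a deliberate variation on the check described above, chosen so the code does not depend on a terminating NUL being present after the value.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for struct berval from <lber.h> (same field names).
 * Hypothetical: declared locally so the sketch builds without libldap. */
struct demo_berval {
    size_t bv_len;   /* number of bytes in the value */
    char  *bv_val;   /* the bytes; may contain embedded NULs */
};

/* Return 1 if every value in the NULL-terminated array can be handled
 * as a C string without losing data, 0 if any value has an embedded NUL.
 * An empty or NULL array counts as "all strings". */
int all_values_are_c_strings(struct demo_berval *const *vals)
{
    size_t i;

    for (i = 0; vals != NULL && vals[i] != NULL; i++) {
        /* Skip zero-length values so memchr is never given a NULL pointer. */
        if (vals[i]->bv_len != 0 &&
            memchr(vals[i]->bv_val, '\0', vals[i]->bv_len) != NULL)
            return 0;
    }
    return 1;
}
```

An application that lets the user name an arbitrary attribute could call something like this on the result and fall back to length-aware handling (or report an error) when it returns 0. Note that this only detects embedded NULs; it says nothing about whether the bytes are valid UTF-8.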
Regards,
Is there even a straightforward way in the protocol to get type information? If the protocol won't tell you, a client library can't tell you.
Jordan Brown wrote:
Is there even a straightforward way in the protocol to get type information? If the protocol won't tell you, a client library can't tell you.
Any client can retrieve the schema definition of any schema element using an LDAP Search request.
On 4/3/2024 10:22 AM, Howard Chu wrote:
Jordan Brown wrote:
Is there even a straightforward way in the protocol to get type information? If the protocol won't tell you, a client library can't tell you.
Any client can retrieve the schema definition of any schema element using an LDAP Search request.
I had thought that all of those schema definitions were server-specific. But I see that RFC 4512 standardizes them. Thanks.
So that would seem to be the answer to the question: if you want to know how to handle a particular data item, you need to query its schema.
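As a rough illustration of what "query its schema" could mean in practice: RFC 4512 has the server publish attribute type definitions as attributeTypes values in the subschema entry (located via the subschemaSubentry attribute), and each definition may carry a SYNTAX OID. The sketch below assumes you have already retrieved such a definition string with an ordinary search; it does not show that search. The helper names extract_syntax_oid() and syntax_may_hold_binary() are hypothetical, the parsing is deliberately naive (it would be fooled by a DESC containing the word SYNTAX), and the list of syntaxes is an illustrative subset, not an exhaustive one. libldap also ships a schema parser declared in ldap_schema.h (ldap_str2attributetype), which is likely a better choice in real code.

```c
#include <stddef.h>
#include <string.h>

/* Copy the SYNTAX OID from an RFC 4512 attribute type description into
 * out. Returns 1 on success, 0 if there is no SYNTAX field (for example
 * when the syntax is inherited via SUP) or if out is too small. */
int extract_syntax_oid(const char *desc, char *out, size_t outsz)
{
    const char *p = strstr(desc, " SYNTAX ");
    size_t n;

    if (p == NULL)
        return 0;
    p += strlen(" SYNTAX ");
    while (*p == ' ')
        p++;

    /* An OID is digits and dots; a {len} bound, if present, is skipped. */
    n = strspn(p, "0123456789.");
    if (n == 0 || n >= outsz)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}

/* Return 1 if the syntax OID is one of a few syntaxes whose values are
 * arbitrary octets. Illustrative subset only. */
int syntax_may_hold_binary(const char *oid)
{
    static const char *const binary_syntaxes[] = {
        "1.3.6.1.4.1.1466.115.121.1.5",   /* Binary */
        "1.3.6.1.4.1.1466.115.121.1.8",   /* Certificate */
        "1.3.6.1.4.1.1466.115.121.1.28",  /* JPEG */
        "1.3.6.1.4.1.1466.115.121.1.40",  /* Octet String */
    };
    size_t i;

    for (i = 0; i < sizeof binary_syntaxes / sizeof binary_syntaxes[0]; i++) {
        if (strcmp(oid, binary_syntaxes[i]) == 0)
            return 1;
    }
    return 0;
}
```

For jpegPhoto, whose definition carries SYNTAX 1.3.6.1.4.1.1466.115.121.1.28, this would report binary. A definition such as "( 2.5.4.3 NAME 'cn' SUP name )" has no SYNTAX of its own, so a real implementation would need to follow the SUP chain to the supertype.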