Simon Josefsson wrote:
Howard Chuhyc@symas.com writes:
The recent trouble in ITS#5361 prompted me to look into the GnuTLS code a little deeper. It turns out that their corresponding set_subject_alt_name() API only takes a char * pointer as input, without a corresponding length. As such, this API will only work for string-form alternative names, and will typically break with IP addresses and other alternatives.
I've assigned this a ticket:
http://trac.gnutls.org/cgi-bin/trac.cgi/ticket/18
It seems most SAN's in wide use are string-like, so nobody had discovered this before. It seems clear gnutls should have a new API to allow setting arbitrary SAN values.
Not really trying to be antagonistic here, but this kind of proves my point about inexperience - SANs are pretty well documented in X.509v3. (Heck, if there's anything you can count on in ITU specs, it's that there are mountains of them.) Anyone who has spent any time with ASN.1 and X.500 knows that your safest, most uniform coding approach is to use explicit (length,data) tuples everywhere. Uniformity in the API design prevents this sort of problem ever occurring in the first place.
Your bug report text is interesting because it's only concerned with truncated binary values. There's a flip side to this problem that you didn't mention, which is the potential to SEGV from accessing unmapped memory due to use of a non-terminated string. And again, an experienced C programmer would understand this issue without needing to be bludgeoned over the head with it (as I am certainly beating you up about it now).
There's another issue here, due to usability. Anyone who has spent time with X.509 certificates has run into the problem of needing to specify multiple SANs to get their certificates working. Your current API only allows a single SAN value to be stored. Again, the API appears to have been designed by somebody who has never used these concepts in the real world, and doesn't understand how these items actually need to work. That's hardly reassuring in the general case, doubly worrying for a piece of security software.
Looking across more of their APIs, I see that the code makes liberal use of strlen and strcat, when it needs to be using counted-length data blobs everywhere. In short, the code is fundamentally broken; most of its external and internal APIs are incapable of passing binary data without mangling it. The code is completely unsafe for handling binary data, and yet the nature of TLS processing is almost entirely dependent on secure handling of binary data.
I strongly recommend that GnuTLS not be used. All of its APIs would need to be overhauled to correct its flaws and it's clear that the developers there are too naive and inexperienced to even understand that it's broken.
I looked at the X.509 API's (x509.h) and I couldn't find any other that didn't take buffer length arguments. I didn't look carefully though.
There is 1 (one) use of 'strcat' in the X.509 code, and it looks correct to me. There was 20 uses of 'strlen' in the X.509 code, and I went over the first matches but when they looked correct I didn't look further. (For reference, the X.509 code size is around 21000 lines of code.)
If you can give more details, that would be appreciated.
Aside from the inherently unsafe nature of using strlen() on strings of unknown origin, the overall code quality is extremely poor. It looks like it was written by a Pascal programmer, someone who's accustomed to strings with embedded lengths and for which strlen() is essentially free. I hesitate to mention this aspect of it since you'll probably consider it a premature optimization.
For example, minitasn1/coding.c: asn1_retCode _asn1_time_der (unsigned char *str, unsigned char *der, int *der_len) { int len_len; int max_len;
max_len = *der_len;
asn1_length_der (strlen (str), (max_len > 0) ? der : NULL, &len_len);
if ((len_len + (int) strlen (str)) <= max_len) memcpy (der + len_len, str, strlen (str)); *der_len = len_len + strlen (str);
if ((*der_len) > max_len) return ASN1_MEM_ERROR;
return ASN1_SUCCESS; }
There are 4 calls to strlen(str) here, which ... I can't even find the words to express how gross this is.
You note that there's really a small number of instances of strcat() in the code. That's true, but that's because you've provided your own _gnutls_str_cat() function instead, which is also heavily used. Assuming that strlen() isn't going to SEGV on you (which depends on dumb luck) this becomes just a question of efficiency.
E.g. in x509/dn.c you have in str_escape() str_length = MIN (strlen (str), buffer_size - 1); which is already bad because the MIN macro may evaluate that strlen() call twice.
This function is called from _gnutls_x509_parse_dn() which makes heavy use of _gnutls_string_append_str() which guess what, calls strlen() on its arguments again. All of your string processing code has O(n^2) efficiency. It's incredibly sloppy and there's no justifiable reason for it.
In x509/output.c your gnutls_x509_crt_print() function constructs a gnutls_string structure with a series of inefficient calls and then, the final capper: _gnutls_string_append_data (&str, "\0", 1); out->data = str.data; out->size = strlen (str.data); It walks all across the length of the string to append a single terminating NUL byte, and then it calls strlen() on the result *again*, even though str.length is already calculated for you in _gnutls_string_append_data().
If you actually had used gnutls_string structures everywhere, you *could* do all string construction operations in O(n) time (n ~ number of characters being added) by simply memcpy'ing to str.data+str.length. As it is, anything based on strcat is inherently O(n^2) and you do so many repeated traversals of the same data that I suspect the actual compute cost is even higher than n^2.