masarati@aero.polimi.it wrote:
I recently hit a pretty long certificate list (CL) with what appears to be garbage past the end of its valid portion. I have no indication of how it was generated, but it is supposedly in production within a CA that initially used a release of OpenLDAP without detailed CL validation in place (remember that CL validation was first released in 2.4). I'm not posting this to the ITS because it's data I'm not allowed to disclose.
To make a long story short, I got the CL in LDIF format; I could convert it to DER and have openssl crl play with it. Apparently, openssl crl recognizes it and deals with its contents correctly, but our CL validator fails because, at the point where it expects to be at the end, there is still stuff to be parsed (some 40KB of what appears to be garbage). Howard found a small issue in CL validation and fixed it (schema_init.c 1.459 -> 1.460), but the issue remains. Howard also discovered that regenerating the CL in DER form using openssl crl yields a shorter certificate list that passes OpenLDAP's validator.
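For anyone wanting to repeat the LDIF-to-DER step, here is a minimal sketch. It assumes the CL is stored as a base64 value (the "::" form) and that standard LDIF line folding applies, where a line starting with a single space continues the previous one. The function name and the attribute name in the usage note are illustrative assumptions, not what was actually used in this case.

```python
import base64


def ldif_binary_values(ldif_text, attr):
    """Return the decoded values of a base64-encoded attribute in an LDIF text.

    Hypothetical helper: `attr` is the attribute description as it appears
    in the LDIF, e.g. "certificateRevocationList;binary".
    """
    # Unfold continuation lines: a leading single space means
    # "append to the previous line".
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    # "attr:: <base64>" marks a base64-encoded value.
    prefix = attr.lower() + "::"
    return [
        base64.b64decode(line[len(prefix):].strip())
        for line in lines
        if line.lower().startswith(prefix)
    ]
```

Writing the first returned value to a file, say crl.der, allows inspecting it with "openssl crl -inform DER -in crl.der -noout -text".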
I'm raising it here because we need to understand how important it is for us to be able to deal with broken CLs, and how broken we can accept them to be. In this case, the CL looks fine up to the end of its valid portion, with garbage after it. This could be tolerated. Or, we could just ignore any type of error, as long as we don't need to deal with the CL's contents (slapd is merely acting as a container, and need not know whether it's containing good or bad data). This latter argument may no longer be valid once slapd takes over as much certificate handling as possible, performing certificate validation internally rather than delegating it to some external package (I understand Howard would probably like to follow that path, eventually).
Yes.
In general, the codebase we inherited from UMich has tended to treat everything as BLOBs (or worse yet, ASCII strings). In my opinion that completely undermines the point of ASN.1; we should make it a priority to properly parse everything according to its ASN.1 definition as a precursor to promoting more application-specific syntaxes down the road. E.g., I think it's stupid that most LDAP schemas use DirectoryString for everything, with some associated comments (not visible in the schema definition) describing the actual intended values for an attribute. LDAP needs proper syntaxes for URLs, IP addresses, email addresses, etc. etc. etc. and we need to get everyone out of the mindset of "everything is a string of octets and we don't know what's inside".
Of course we don't need to go overboard; we don't need to do a full JPEG decompression on every incoming jpegPhoto attribute. But we probably ought to check the first few octets for the signature/magic numbers...
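A minimal sketch of that kind of shallow check, assuming only that a JPEG stream begins with the SOI marker (FF D8) followed by another marker byte (FF). The function name is hypothetical, and this is not how slapd validates jpegPhoto today.

```python
# First three octets of a JPEG stream: SOI marker (FF D8) plus the
# lead byte (FF) of the marker that follows it.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(value):
    """Cheap signature test; does not decode or otherwise validate the image."""
    return value.startswith(JPEG_MAGIC)
```

A check like this rejects obviously wrong data (text, a GIF, an empty value) at negligible cost, while making no claim that the rest of the image is well formed.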
But in the case of objects that are part of the actual X.500 spec (such as certificates and certificate lists) we will typically need to have deep knowledge of their structure (at least to extract the relevant DNs) and there's no excuse for a directory software package not to understand these aspects of the directory specification.
Unless there is strong opposition, I'd relax the last check about being at the end of the CL, in order to accept CLs with this type of brokenness, possibly logging the issue.
I guess it would be OK to log the issue and return success. One other check we might do instead is to remember the initial Sequence length, and make sure we've hit the proper end-of-sequence location.
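A minimal sketch of the end-of-sequence idea, assuming the value is DER, so the outermost element is a SEQUENCE (tag 0x30) with a definite length. The function names are hypothetical and this is not the schema_init.c code.

```python
def outer_sequence_end(der):
    """Return the offset one past the end of the outermost DER SEQUENCE."""
    if len(der) < 2 or der[0] != 0x30:
        raise ValueError("value does not start with a SEQUENCE")
    first = der[1]
    if first < 0x80:
        # Short form: the length octet is the content length itself.
        header_len, content_len = 2, first
    else:
        # Long form: the low 7 bits give the number of length octets.
        n = first & 0x7F
        if n == 0 or len(der) < 2 + n:
            raise ValueError("indefinite or truncated length")
        header_len = 2 + n
        content_len = int.from_bytes(der[2:2 + n], "big")
    end = header_len + content_len
    if end > len(der):
        raise ValueError("SEQUENCE is longer than the value")
    return end


def trailing_octets(der):
    """Number of octets that follow the outermost SEQUENCE (0 if none)."""
    return len(der) - outer_sequence_end(der)
```

With the end offset remembered up front, the validator can require that parsing stops exactly there, and treat anything beyond it as trailing garbage to be logged rather than as a parse failure.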