Wandering Thoughts archives

2009-09-18

Are you sure it's a C string?

Here's a recent Ubuntu security alert about KDE:

It was discovered that KDE did not properly handle certificates with NULL characters in the Subject Alternative Name field of X.509 certificates. An attacker could exploit this to perform a man in the middle attack to view sensitive information or alter encrypted communications.

Let me translate this for you:

KDE put a text field in a C string without making sure that it actually was a C string. It turns out that it wasn't.

This is not just KDE's problem; people make this mistake over and over. They see something that the protocol documentation or the format documentation calls text or a string or something like that, and think 'I know, I can use a C string to hold it'. This is usually a bad mistake, one that leads to obscure bugs (if you're lucky) and security holes (if you're not).

Just because C strings are called 'strings' and fields in some network protocol or storage format are also called 'strings' or 'text' does not mean that the two are the same thing. Not even close, because C strings can't represent arbitrary byte sequences. Before you ever, ever use a C string to represent such a field, you must make absolutely sure that it cannot (validly) contain NULL bytes. And even then you must make your conversion code actually check this and declare things invalid if a string or text or whatever field turns out to have a 0 byte after all.
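To make this concrete, here is a minimal sketch in C of what such a conversion check might look like; field_to_cstring is a hypothetical helper invented for illustration, not part of any real library:

    #include <stdlib.h>
    #include <string.h>

    /* Convert a length-delimited field into a C string only if it
       really can be one, ie it contains no embedded NUL bytes.
       Returns a malloc()ed string, or NULL to declare the field
       invalid (or on allocation failure). The caller frees. */
    char *field_to_cstring(const unsigned char *field, size_t len)
    {
        if (len == (size_t)-1)          /* guard the len + 1 below */
            return NULL;
        if (memchr(field, '\0', len) != NULL)
            return NULL;                /* embedded NUL: not a C string */
        char *s = malloc(len + 1);
        if (s == NULL)
            return NULL;
        memcpy(s, field, len);
        s[len] = '\0';
        return s;
    }

The important property is that the check happens at conversion time, so nothing downstream ever sees a 'string' that silently lies about where it ends.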

(In this modern world of internationalization it is not enough to be assured that the field cannot contain NULL characters, unless you are sure that the writeup really means 'bytes'; in certain encodings, such as UTF-16, non-NULL characters can contain NULL bytes.)
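A tiny C illustration of the UTF-16 case (assuming the little-endian byte order):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* "AB" encoded as UTF-16LE: two ordinary characters, but
           every other byte is 0x00. */
        const char ab[] = { 0x41, 0x00, 0x42, 0x00 };
        printf("%zu\n", strlen(ab));    /* prints 1, not 4 */
        return 0;
    }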

Better yet, never even try to use a C string to hold such fields in the first place. Use an explicit-length buffer implementation of some sort to hold the data, never use C string functions on it, and be safe no matter what. (Okay, you get to worry about buffer length variable overflows. Or, hopefully, the author of your buffer library has done that for you. Find one that's had a security audit for length overflows and signedness issues.)

(While I like bstring, I don't know if it's been audited carefully.)
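For illustration, the core of an explicit-length buffer is quite small; this is only a sketch of the idea with made-up names, not a substitute for an audited library:

    #include <stdlib.h>
    #include <string.h>

    /* A minimal explicit-length buffer: the length travels with the
       data, and an embedded NUL is just another byte. */
    struct buf {
        unsigned char *data;
        size_t len;
    };

    int buf_set(struct buf *b, const unsigned char *src, size_t len)
    {
        b->data = malloc(len ? len : 1);
        if (b->data == NULL)
            return -1;
        memcpy(b->data, src, len);
        b->len = len;
        return 0;
    }

    /* Compare byte for byte with memcmp(), never strcmp(). */
    int buf_equal(const struct buf *a, const struct buf *b)
    {
        return a->len == b->len &&
               memcmp(a->data, b->data, a->len) == 0;
    }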

Sidebar: what the bug (and attack) is

The general bug that this mistake creates is that your program sees a different and shorter field value than the real one. If the actual field is, say, 'CN=www.a.com\0attack.org' (with \0 standing in for the NULL byte), your program sees the field as 'CN=www.a.com'. The attacker exploits this mismatch to slip a dangerous field value (dangerous as your program sees it) past correctly written checking code elsewhere, which sees and approves the real, full value.
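You can watch this happen directly in C:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* The example field above, embedded NUL and all: 23 bytes. */
        const char field[] = "CN=www.a.com\0attack.org";
        size_t len = sizeof(field) - 1;

        /* C string functions stop at the NUL and see only the prefix. */
        printf("strlen() sees %zu bytes: \"%s\"\n", strlen(field), field);
        printf("the real field is %zu bytes\n", len);
        return 0;
    }

This prints 'strlen() sees 12 bytes: "CN=www.a.com"' and then 'the real field is 23 bytes'.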

This exact trick is the general SSL certificate attack. It turns out that at least some Certificate Authorities are (or were) perfectly happy to sign a certificate with hostnames (and other fields) that had embedded NULL bytes; when misinterpreted by buggy programs, these certificates are accepted as being for www.a.com instead of their real hostname. Add some DNS or routing manipulation and an attacker can do a perfect impersonation attack.

To a fair extent, the bug-fixing process has been a cross product of finding all of the fields that can be exploited this way and finding all of the programs and libraries that made this mistake with them.

(Answer: more fields than you might expect and a lot of programs. If you have an SSL-using program that cares about certificates, audit it now, and for all certificate fields.)

BeSureItsACString written at 01:36:33

2009-09-03

Programming blindness and security

Here is a thought and a theory that have recently struck me.

One reason that writing secure programs is hard is that we programmers have a kind of blindness about our own code. Instead of seeing the code as it really is, we tend to see the code as we imagine it in our heads, and as a result we see what we wrote it to do, not what it actually does.

(This is not usually a literal blindness, although we can do that too, but a disconnect between what the text says and what we 'know' it does.)

In ordinary code, this just makes your bugs harder to find (and leaves you wondering how you could have possibly missed such an obvious mistake once you find it). In security-sensitive code it leads to holes that you can't see, because of course you didn't intend to create holes. If you wrote code to introspect operations on the fly by loading a file from a directory, you don't see that it could load a file from anywhere with judicious use of ../ in the filename, because that's not what you wrote the code to do. Of course the code doesn't let you load arbitrary files, because you didn't write the code to load arbitrary files; you wrote it to load files from a directory.
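A sketch of the sort of code I mean, with load_plugin and the directory name invented for illustration:

    #include <stdio.h>

    /* Meant to load files only from ./plugins, and that is all its
       author 'sees' it doing. But nothing stops name from being
       "../../../etc/passwd", so it will open anything reachable. */
    FILE *load_plugin(const char *name)
    {
        char path[4096];
        snprintf(path, sizeof(path), "plugins/%s", name);
        return fopen(path, "r");
    }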

Effective debugging in general requires being able to escape this blindness, but at least there you have an initial shove: you know that there's a bug in your code. Checking for security holes is even harder because there is nothing obviously wrong, nothing to kick you out of your mindset of blindness and get you to take a second look.

This leads to obvious but interesting (to me) thoughts about the effectiveness of things like pair programming, code reviews, and security audits. From the angle of this theory, these work in part because they expose your code to people who have less of a chance of being caught up in the blindness.

(I suspect that this is not an original thought.)

ProgrammingBlindness written at 00:13:07

