Wandering Thoughts archives

2016-01-22

Memory-safe languages and reading very sensitive files

Here is an obvious question: does using modern memory-safe languages like Go, Rust, and so on mean that the issues in what I learned from OpenSSH about reading very sensitive files are not all that important? After all, the fundamental problem in OpenSSH came from C's unsafe handling of memory; all of the things I learned just made it worse. As it happens, my view is that if you are writing security-sensitive code you should still worry about these things in a memory-safe language, because there are things that can still go wrong. So let's talk about them.

The big scary nightmare is a break in the safety of the runtime and thus the fundamental language guarantees, resulting in the language leaking (or overwriting) memory. Of course this is not supposed to happen, but language runtimes (and compilers) are written by people and so can have bugs. In fact we've had a whole series of runtime memory handling bugs in JIT'd language environments that caused serious security issues; there have been ones in JavaScript implementations, the JVM, and in the Flash interpreter. Modern compiled languages may be simpler than these JIT environments, but they have their own complexities where memory bugs may lurk; Go has a multi-threaded concurrent garbage collector, for example.

I'm not saying that there will be a runtime break in your favorite language. I'm just saying that it seems imprudent to base the entire security of your system on an assumption that there will never be one. Prudence suggests defense in depth, just in case.

The more likely version of the nightmare is a runtime break due to bugs in explicitly 'unsafe' code somewhere in your program's dependencies. Unsafe code explicitly can break language guarantees, and it can show up in any number of places and contexts. For example, the memory safety of calls into many C libraries depends on the programmer doing everything right (and on the C libraries themselves not having memory safety bugs). This doesn't need to happen in the code you're writing; instead it could be down in the dependency of a dependency, something that you may have no idea that you're using.

(A variant is that part of the standard library (or package set) either calls C or does something else marked as unsafe. Go code will call the C library to do some sorts of name resolution, for example.)

Finally, even if the runtime is perfectly memory safe it's possible for your code to accidentally leak data from valid objects. Take buffer handling, for example. High-performance code often retains and recycles already-allocated buffers rather than churning the runtime memory allocator with a stream of new allocations, which opens you up to old buffer contents leaking when a recycled buffer is not completely reset before reuse. Or maybe someone accidentally used a common, permanently allocated global temporary buffer somewhere, and with the right sequence of actions an attacker can scoop out sensitive data from it. There are all sorts of variants that are at least possible.
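As a sketch of how this goes wrong, here is a minimal Go free list of recycled scratch buffers (hypothetical names, single-threaded for brevity). Nothing here is memory unsafe; the leak is purely that recycled buffers are never reset:

```go
package main

import "fmt"

// A simple free list of recycled 4K scratch buffers.
var freeList [][]byte

func getBuf() []byte {
	if n := len(freeList); n > 0 {
		b := freeList[n-1]
		freeList = freeList[:n-1]
		return b // bug: handed back with its old contents intact
	}
	return make([]byte, 4096)
}

func putBuf(b []byte) {
	freeList = append(freeList, b) // bug: should zero b before recycling
}

func main() {
	// One request handles a secret...
	b := getBuf()
	copy(b, "hunter2")
	putBuf(b)

	// ...and a later, unrelated request scoops it back out.
	c := getBuf()
	fmt.Printf("%q\n", c[:7]) // "hunter2"
}
```

Every line of this is perfectly memory safe as far as the language is concerned.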

The good news is that the scope of problems is narrower at this level. Since what you're leaking is the previous state of a specific valid object, an attacker needs the previous state of that sort of object to be something damaging. You don't get the kind of arbitrary crossovers of data that you do with full memory leaks. Still, leaks in a central type of object (such as 'generic byte buffers') could be damaging enough.

The bad news is that your memory safe language cannot save you from this sort of leak, because it's fundamentally an algorithmic mistake. Your code and the language are doing exactly what you told them to, and in a safe way; it is just that what you told them to do is a bad idea.

(Use of unsafe code, C libraries, various sorts of object recycling, and so on is not necessarily obvious, either. With the best of intentions, packages and code that you use may well hide all of this from you in the name of 'it's an implementation detail' (and they may change it over time for the same reason). They're not wrong, either, it's just that because of how you're using their code it's become a security sensitive implementation detail.)

SafeReadingInSafeLanguages written at 02:24:08

2016-01-15

Things I learned from OpenSSH about reading very sensitive files

You may have heard that OpenSSH had an exploitable issue with some bad client code (which is actually two CVEs, CVE-2016-0777 and CVE-2016-0778). The issue was reported by Qualys Security, who released a fascinating and very detailed writeup on it. While the direct problem is basically the same as in Heartbleed, namely trusting an attacker-supplied length parameter and then sending back whatever happened to be sitting in memory, Qualys Security identified several issues that allowed private keys to leak through this issue despite OpenSSH's attempts to handle them securely. The specific issues are also fascinating in how they show just how hard it is to securely read sensitive files.

So here is what I have learned from OpenSSH about this:

  • Do not use any sort of library level buffered IO. OpenSSH read private keys with stdio, which left copies of them behind in stdio buffers that were later free()'d. If your data is sensitive enough that you are going to explicitly erase it later, you must ensure that it never passes through buffers that you do not control (and then you zero your own buffers afterwards).

    (I suspect that I have Go code that fumbles this, although doing this in Go in general is at least a bit tricky.)

  • Do not use any convenient form of memory or buffer handling that magically reallocates a new buffer and copies data when you ask a buffer to grow. The other way OpenSSH leaked private keys into memory was through realloc(), which may of course free the buffer you handed it and give you another one.

    There are all sorts of convenient auto-growing buffers and objects in all sorts of languages that are going to be vulnerable to this. You need to avoid them all. Once again, explicitly handling all storage yourself is required (and explicitly erasing all of it).
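Go's append() is a handy illustration of the hazard (realloc() behaves the same way in C). When a slice grows past its capacity the runtime allocates a fresh array and copies the data over, and the abandoned copy can no longer be erased through the slice you hold:

```go
package main

import "fmt"

func main() {
	secret := []byte("secret")

	// Start with a deliberately small capacity.
	buf := make([]byte, 0, 4)
	buf = append(buf, secret[:4]...)
	oldHome := buf[:4] // remember the original backing array

	// This append exceeds the capacity: a new array is allocated,
	// the bytes are copied, and "secr" is left behind in the old one.
	buf = append(buf, secret[4:]...)

	fmt.Println(&buf[0] != &oldHome[0]) // true: the data moved

	// Carefully zeroing buf does nothing for the abandoned copy.
	for i := range buf {
		buf[i] = 0
	}
	fmt.Printf("%q\n", oldHome) // "secr"
}
```

In real code you never hold oldHome, so the stale copy just sits in garbage-collected memory until something reuses it.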

  • Do not use general but over-powerful facilities in security sensitive code. OpenSSH apparently used a general 'load keypair' function that read the private key too even though it only needed the public key, which resulted in OpenSSH bringing into memory (and exposing) the private keys for all public keys it checked, not just the private key of the public key it was going to use with the server.

    It's easy to see how this happened, and strictly from a programming perspective it's the right answer. We have to load keypairs sometimes, so rather than have a 'load keypair' function and a 'load public key' function and so on, you just have a fully general 'load keypair' function and throw away the parts of the results you don't need. But the result is that we loaded key data we didn't need into memory and then it leaked.

    (Admittedly the issue in OpenSSH is somewhat complex, since you can remove the .pub file and force it to decrypt your private key file to recover the public key (see the comments for this entry).)

(There is also an additional issue where overly clever C compilers can eliminate 'unnecessary' memset() operations that are supposed to erase the key data.)

The thing that scares me is that all of these are really easy mistakes to make, or rather really easy issues to overlook. I could have written code that did all three of them without even thinking about it, and I might have done so even if I was also writing code to carefully erase key data later.

Part of how they are so easy to overlook is that we are trained to think in terms of abstractions, not the details underneath them. Stdio gives you efficient IO (and maybe you remember that it involves buffering), realloc() just magically grows the allocated space (and oh yeah, sometimes it gives you a new area of memory, which means the old one got freed), and so on. And in fact all of these would have been harmless or mostly harmless in the absence of the core 'server can get a copy of random memory' bug.

ReadingSensitiveFilesLessons written at 01:30:39

