Memory-safe languages and reading very sensitive files

January 22, 2016

Here is an obvious question: does using modern memory safe languages like Go, Rust, and so on mean that the issues in what I learned from OpenSSH about reading very sensitive files are not all that important? After all, the fundamental problem in OpenSSH came from C's unsafe handling of memory; all of the things I learned just made it worse. As it happens, my view is that if you are writing security sensitive code you should still worry about these things in a memory safe language, because there are things that can still go wrong. So let's talk about them.

The big scary nightmare is a break in the safety of the runtime and thus the fundamental language guarantees, resulting in the language leaking (or overwriting) memory. Of course this is not supposed to happen, but language runtimes (and compilers) are written by people and so can have bugs. In fact we've had a whole series of runtime memory handling bugs in JIT'd language environments that caused serious security issues; there have been ones in JavaScript implementations, the JVM, and in the Flash interpreter. Modern compiled languages may be simpler than these JIT environments, but they have their own complexities where memory bugs may lurk; Go has a multi-threaded concurrent garbage collector, for example.

I'm not saying that there will be a runtime break in your favorite language. I'm just saying that it seems imprudent to base the entire security of your system on an assumption that there will never be one. Prudence suggests defense in depth, just in case.

The more likely version of the nightmare is a runtime break due to bugs in explicitly 'unsafe' code somewhere in your program's dependencies. Unsafe code explicitly can break language guarantees, and it can show up in any number of places and contexts. For example, the memory safety of calls into many C libraries depends on the programmer doing everything right (and on the C libraries themselves not having memory safety bugs). This doesn't need to happen in the code you're writing; instead it could be down in the dependency of a dependency, something that you may have no idea that you're using.

(A variant is that part of the standard library (or package set) either calls C or does something else marked as unsafe. Go code will call the C library to do some sorts of name resolution, for example.)

Finally, even if the runtime is perfectly memory safe it's possible for your code to accidentally leak data from valid objects. Take buffer handling, for example. High performance code often retains and recycles already allocated buffers rather than churning the runtime memory allocator with a stream of new buffer allocations, which opens you up to old buffer contents leaking because things were not completely reset when one was reused. Or maybe someone accidentally used a common, permanently allocated global temporary buffer somewhere, and with the right sequence of actions an attacker can scoop out sensitive data from it. There are all sorts of variants that are at least possible.

The good news is that the scope of problems is narrower at this level. Since what you're leaking is the previous state of a specific valid object, an attacker needs the previous state of that sort of object to be something damaging. You don't get the kind of arbitrary crossovers of data that you do with full memory leaks. Still, leaks in a central type of object (such as 'generic byte buffers') could be damaging enough.

The bad news is that your memory safe language cannot save you from this sort of leak, because it's fundamentally an algorithmic mistake. Your code and the language is doing exactly what you told it to, and in a safe way; it is just that what you told it to do is a bad idea.

(Use of unsafe code, C libraries, various sorts of object recycling, and so on is not necessarily obvious, either. With the best of intentions, packages and code that you use may well hide all of this from you in the name of 'it's an implementation detail' (and they may change it over time for the same reason). They're not wrong, either, it's just that because of how you're using their code it's become a security sensitive implementation detail.)

Written on 22 January 2016.
« One example of why I like ZFS on Linux
Browsers are increasingly willing to say no to users over HTTPS issues »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Jan 22 02:24:08 2016
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.