2014-12-15
How a Firefox update just damaged practical security
Recently, Mozilla pushed out Firefox 34 as one of its regular periodic Firefox updates. Unfortunately this shipped with a known incompatible change that broke several extensions, including the popular Flashblock extension. Mozilla had known about this problem for months before the release; in fact the bug report was filed essentially immediately after the change in question landed in the tree, and the breakage was known when the change was proposed. Mozilla people didn't care enough to do anything in particular about this beyond (I think) blacklisting the extension as non-functional in Firefox 34.
I'm sure that this made sense internally in Mozilla and was justified at the time. But in practice this was a terrible decision, one that's undoubtedly damaged pragmatic Firefox security for some time to come. Given that addons create a new browser, the practical effect of this decision is that Firefox's automatic update to Firefox 34 broke people's browsers. When your automatic update breaks people's browsers, congratulations, you have just trained them to turn your updates off. And turning automatic updates off has very serious security impacts.
The real world effect of Mozilla's decision is that Mozilla has now trained some number of users that if they let Mozilla update Firefox, things break. Since users hate having things break, they're going to stop allowing those updates to happen, which will leave them exposed to real Firefox security vulnerabilities that future updates would fix (and we can be confident that there will be such updates). Mozilla did this damage not for a security critical change but for a long term cleanup that they decided was nice to have.
(Note that Mozilla could have used any of a number of methods to fix the popular extensions that were known to be broken by this change, since the actual change required in the extensions is extremely minimal.)
I don't blame Mozilla for making the initial change; trying to make this change was sensible. I do blame Mozilla's release process for allowing this release to happen knowing that it broke popular extensions and doing nothing significant about it, because Mozilla's release process certainly should care about the security impact of Mozilla's decisions.
Why your 64-bit Go programs may have a huge virtual size
For various reasons, I build (and rebuild) my copy of the core Go
system from the latest development source on a regular basis, and
periodically rebuild the Go programs I use from that build. Recently
I was looking at the memory use of one of my programs with ps and
noticed that it had an absolutely huge virtual size (Linux ps's VSZ
field) of around 138 GB, although it had only a moderate resident
set size.
This nearly gave me a heart attack, since a huge virtual size with
a relatively tiny resident set size is one classical sign of a
memory leak.
(Builds with earlier versions of Go tended to have much more modest virtual sizes, on the order of 32 MB to 128 MB depending on how long the program had been running.)
Fortunately this was not a memory leak. In fact, experimentation
soon demonstrated that even a basic 'hello world' program had that
huge a virtual size. Inspection of the process's /proc/<pid>/smaps
file showed that basically all of the
virtual space used was coming from two inaccessible mappings, one
roughly 8 GB long and one roughly 128 GB. These mappings had no
access permissions (they disallowed reading, writing, and executing)
so all they did was reserve address space (without ever using any
actual RAM). A lot of address space.
It turns out that this is how Go's current low-level memory management likes to work on 64-bit systems. Simplified somewhat, Go does low-level allocations in 8 KB pages taken from a (theoretically) contiguous arena; which pages are free versus allocated is stored in a giant bitmap. On 64-bit machines, Go simply pre-reserves the entire memory address space for both the bitmap and the arena itself. As the runtime and your Go code start to actually use memory, pieces of the arena bitmap and the memory arena will be changed from simple address space reservations into memory that is actually backed by RAM and being used for something.
(Mechanically, the bitmap and arena are initially mmap()'d with
PROT_NONE. As memory is used, it is remapped with
PROT_READ|PROT_WRITE. I'm not confident that I understand what
happens when it's freed up, so I'm not going to say anything there.)
All of this is the case for the current post-Go 1.4 development version of Go. Go 1.4 and earlier behave differently, with much lower virtual sizes for running 64-bit programs, although from reading the Go 1.4 source code I'm not sure I understand why.
As far as I can tell, one of the interesting consequences of this is that 64-bit Go programs can use at most 128 GB of memory for most of their allocations (perhaps all of them that go through the runtime, I'm not sure).
For more details on this, see the comments in src/runtime/malloc2.go
and in mallocinit() in src/runtime/malloc1.go.
I have to say that this turned out to be more interesting and
educational than I initially expected, even if it means that watching
ps is no longer a good way to detect memory leaks in your Go programs
(mind you, I'm not sure it ever was). As a result, the best way to
check this sort of memory usage is probably some combination of
runtime.ReadMemStats() (perhaps exposed through net/http/pprof) and
Linux's smem program or the like to obtain detailed information on
meaningful memory address space usage.
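As a small illustration of the first option, here is a sketch of pulling a couple of numbers out of runtime.ReadMemStats(), which fills in a runtime.MemStats struct that you pass to it. The memSummary helper and the choice of the Sys and HeapAlloc fields are mine; which fields are most useful for spotting a leak is a judgment call.

```go
package main

import (
	"fmt"
	"runtime"
)

// memSummary (hypothetical helper) returns two of the runtime's own memory
// counters: Sys, its count of bytes obtained from the OS, and HeapAlloc,
// the bytes of allocated heap objects.
func memSummary() (sys, heapAlloc uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.Sys, m.HeapAlloc
}

func main() {
	sys, heapAlloc := memSummary()
	fmt.Printf("obtained from OS: %d KB, live heap: %d KB\n", sys>>10, heapAlloc>>10)
}
```

Calling something like this periodically and watching whether the heap figure keeps climbing is probably a better leak signal than VSZ.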
PS: Unixes are generally smart enough to understand that PROT_NONE
mappings will never use up any memory and so shouldn't count against
things like system memory overcommit limits. However they generally
will count against a per-process limit on total address space, which
likely means that you can't really use such limits and run post-1.4
Go programs. Since total address space limits are rarely used, this
is unlikely to be an issue.
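If you want to know whether such a limit applies to a process, a program can ask about its own limit. This is a sketch assuming Linux, where the per-process address space limit is RLIMIT_AS and 'unlimited' is reported as a value with all bits set; the addressSpaceLimit name is invented for the example.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// addressSpaceLimit (hypothetical helper) returns the soft RLIMIT_AS value
// and whether it is an actual limit rather than "unlimited".
func addressSpaceLimit() (limit uint64, limited bool, err error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_AS, &rl); err != nil {
		return 0, false, err
	}
	// Assumption: on Linux an unlimited resource has every bit set.
	return rl.Cur, rl.Cur != ^uint64(0), nil
}

func main() {
	limit, limited, err := addressSpaceLimit()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getrlimit failed:", err)
		os.Exit(1)
	}
	if !limited {
		fmt.Println("no address space limit is set")
		return
	}
	fmt.Printf("address space is limited to %d MB\n", limit>>20)
}
```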
Sidebar: How this works on 32-bit systems
The full story is in the mallocinit() comment. The short version
is that the runtime reserves a large enough arena bitmap to handle
2 GB of memory (which 'only' takes 256 MB) but only reserves 512 MB
of arena address space out of the 2 GB it could theoretically use.
If the runtime later needs more memory, it asks the OS for another
block of address space and hopes that it is in the remaining 1.5 GB
of address space that the bitmap covers. Under many circumstances
the odds are good that the runtime will get what it needs.