My experiment with Firefox Nightly builds: a failure
Ever since my old Firefox build started crashing and I was forced to update to current versions, I've had serious memory issues with Firefox. I used to be able to leave Firefox running for weeks (or months) with basically stable memory usage. Now, Firefox will steadily bloat up from under a GB of resident memory at its initial steady state to, say, 1.5 GB in a few days at most. Although my current machine has 16 GB of RAM, Firefox progressively gets slower and slower as its resident memory grows; by the time it reaches around 1.5 to 1.6 GB resident the performance is visibly dragging and I have to restart.
Recently I stumbled across this Mozilla blog entry on Firefox memory usage, which discusses how current Firefox builds have changes that reduce memory leaks, especially a drastic reduction in zombie compartments (see this entry for more). Ever since I discovered the verbose about:memory information, I've noticed that I have zombie compartments that linger from my ordinary browsing; the longer I browse, the more zombie compartments build up. A Firefox change that actually dropped zombie compartments seemed very promising, certainly promising enough to build a current version of Firefox and see what happened.
(Thus this is not quite an experiment with the literal Nightly builds, although it should be very close; as far as I understand, they're built from the same source repository (see also) that I was using.)
Unfortunately, the experiment turned out to be mostly a failure, although a sort of interesting one; in some ways Firefox improved but in other ways it got significantly worse. I tweeted a cryptic short form version, and I feel like elaborating on it now.
What improved was Firefox's responsiveness as its resident memory grew. Firefox 12 visibly starts slowing down with as little as 1.2 or 1.3 GB of resident memory; the current Firefox code was still running almost as well as at start when it reached 2 GB or more of resident memory, and it might have kept going even as it bloated more. What did not improve was everything else. I still saw zombie compartments (probably just as many as before) and if anything Firefox memory usage grew faster than under Firefox 12, reaching 2 GB resident in a day or two. But the worse thing was that at home, Firefox would soon get into a state where it was constantly using CPU (apparently talking with the X server). In this state it would not shut down gracefully; I could quit Firefox and it would close all its windows, but the process would not exit and would continue consuming the CPU talking with the X server.
(I had to use '
kill -9' to get it to exit, and this happened more than
once with builds across several days. It was also odd CPU usage; it
showed clearly in
top but did not affect the load average and didn't
lag the X server that I could tell.)
It's possible that the current Firefox codebase will improve as it marches towards release, eliminating the memory bloat and 100% CPU usage while preserving responsiveness as its memory usage grows. I could live with that and it certainly would be an improvement over the status quo. (In some ways, simply eliminating the CPU usage would be a bit of an improvement over the status quo, although I don't like Firefox consuming several GB of my RAM for no good reason.)
(Despite the result, I don't regret doing this experiment; it was worth trying and it didn't particularly explode in my face.)
Update, May 17th: It seems that most of my Nightly memory problems were probably due to a single old extension I was using. See this update.
Sidebar: dealing with this with Chrome or by disabling extensions
Chrome is not something I consider an acceptable alternative to Firefox, so switching to it is not an option.
One piece of advice the Mozilla people give about this sort of memory bloat is 'disable unnecessary addons'. Well, I don't have any of those; all of the addons I have loaded are ones that I consider either absolutely necessary (to the point where I would not browse without them) or important for how I use Firefox.
(I suppose there's one or two that I don't use very often, like It's All Text!, but it would be actively painful periodically.)
A basic step in measuring and improving network performance
There is a mistake that I have seen people make over and over again when they attempt to improve, tune, or even check network performance under unusual circumstances. Although what set me off now is this well intentioned article, I've seen the same mistake in people setting off to improve their iSCSI performance, NFS performance, and probably any number of other things that I've forgotten by now.
The mistake is skipping the most important basic step of network performance testing: the first thing you have to do is make sure that your network is working right. Before you can start tuning to improve your particular case or start measuring the effects of different circumstances, you need to know that your base case is not suffering from performance issues of its own. If you skip this step, you are building all future results on a foundation of sand and none of them are terribly meaningful.
(They may be very meaningful for you in that they improve your system's performance right now, but if your baseline performance is not up to what it should be it's quite possible that you could do better by addressing that.)
In the very old days, the correct base performance level you could expect was somewhat uncertain and variable; getting networks to run fast was challenging for various reasons. Fortunately those days have long since passed. Today we have a very simple performance measure, one valid for any hardware and OS from at least the past half decade if not longer:
Any system can saturate a gigabit link with TCP traffic.
As I've written before in passing, if you have two machines with gigabit Ethernet talking directly to each other on a single subnet you should be able to get gigabit wire rates between them (approximately 110 MBytes/sec) with simple testing tools like ttcp. If you cannot get this rate between your two test machines, something is wrong somewhere and you need to fix it before there's any point in going further.
(There are any number of places where the problem could be, but one definitely exists.)
I don't have an answer for what the expected latency should be (as
measured either by
ping or by some user-level testing tool), beyond
that it should be negligible. Our servers range from around 150
microseconds down to 10 microseconds, but there's other traffic going
on, multiple switch hops, and so on. Bulk TCP tends to smooth all of
that out, which is part of why I like it for this sort of basic tests.
As a side note, a properly functioning local network has basically no packet loss whatsoever. If you see any more than a trace amount, you have a problem (which may be that your network, switches, or switch uplinks are oversaturated).
The one area today where there's real uncertainty in the proper base performance is 10G networking; we have not yet mastered the art of casually saturating 10G networks and may not for a while. If you have 10G networks you are going to have to do your own tuning and measurements of basic network performance before you start with higher level issues, and you may have to deliberately tune for your specific protocol and situation in a way that makes other performance worse.