2018-04-20
The increasingly surprising limits to the speed of our Amanda backups
When I started dealing with backups the slowest part of the process was generally writing things out to tape, which is why Amanda was much happier when you gave it a 'holding disk' that it could stage all of the backups to before it had to write them out to tape. Once you had that in place, the speed limit was generally some mix between the network bandwidth to the Amanda server and the speed of how fast the machines being backed up could grind through their filesystems to create the backups. When networks moved to 1G, you (and we) usually wound up being limited by the speed of reading through the filesystems to be backed up.
(If you were backing up a lot of separate machines, you might initially be limited by the Amanda server's 1G of incoming bandwidth, but once most machines started finishing their backups you usually wound up with one or two remaining machines that had larger, slower filesystems. This slow tail wound up determining your total backup times. This was certainly our pattern, especially because only our fileservers have much disk space to back up. The same has typically been true of backing up multiple filesystems in parallel from the same machine; sooner or later we wind up stuck with a few big, slow filesystems, usually ones we're doing full dumps of.)
Then we moved our Amanda servers to 10G-T networking and, from my perspective, things started to get weird. When you have 1G networking, it is generally slower than even a single holding disk; unless something's broken, modern HDs will generally do at least 100 Mbytes/sec of streaming writes, which is enough to keep up with a full speed 1G network. However this is only just over 1G data rates, which means that a single HD is vastly outpaced by a 10G network. As long as we had a number of machines backing up at once, the Amanda holding disk was suddenly the limiting factor. However, for a lot of the run time of backups we're only backing up our fileservers, because they're where all the data is, and for that we're currently still limited by how fast the fileservers can do disk IO.
(The fileservers only have 1G network connections for reasons. However, usually it's disk IO that's the limiting factor, likely because scanning through filesystems is seek-limited. Also, I'm ignoring a special case where compression performance is our limit.)
All of this is going to change in our next generation of fileservers, which will have both 10G-T networking and SSDs. Assuming that the software doesn't have its own IO rate limits (which is not always a safe assumption), both the aggregate SSDs and all the networking from the fileservers to Amanda will be capable of anywhere from several hundred Mbytes/sec up to as much 10G bandwidth as Linux can deliver. At this point the limit on how fast we can do backups will be down to the disk speeds on the Amanda backup servers themselves. These will probably be significantly slower than the rest of the system, since even striping two HDs together would only get us up to around 300 Mbytes/sec at most.
(It's not really feasible to use a SSD for the Amanda holding disk, because it would cost too much to get the capacities we need. We currently dump over a TB a day per Amanda server, and things can only be moved off the holding disk at the now-paltry HD speed of 100 to 150 Mbytes/sec.)
This whole shift feels more than a bit weird to me; it's upended my perception of what I expect to be slow and what I think of as 'sufficiently fast that I can ignore it'. The progress of hardware over time has made it so the one part that I thought of as fast (and that was designed to be fast) is now probably going to be the slowest.
(This sort of upset in my world view of performance happens every so often, for example with IO transfer times. Sometimes it even sticks. It sort of did this time, since I was thinking about this back in 2014. As it turned out, back then our new fileservers did not stick at 10G, so we got to sleep on this issue until now.)
Spam from Yahoo Groups has quietly disappeared
Over the years I have written several times about what was, at the time, an ongoing serious and long-term spam problem with email from Yahoo Groups. Not only was spam almost all of the Groups email that we got, but it was also clear that Yahoo Groups was allowing spammers to create their own mailing lists. I was coincidentally reminded of this history recently, so I wondered how things were today.
One answer is that spam from Yahoo Groups has disappeared. Oh, it's not completely and utterly gone; we rejected one probable spam in last December and two at the end of July 2017, which is almost as far back as our readily accessible logs go (they stretch back to June 15th, 2017). But for pretty much anyone, much less what it was before, that counts as completely vanished. Certainly it counts for not having any sort of spam problem.
But this is the tip of the iceberg, because it turns out that email volume from Yahoo Groups has fallen off significantly as well. We almost always get under ten accepted messages a day from Yahoo Groups, and some days we get none. Even after removing the spam, this is nothing like four years ago in 2014, when my entry implies that we got about 22 non-spam messages a day from Yahoo Groups.
At one level I'm not surprised. Yahoo has been visibly and loudly dying for quite a while now, so I bet that a lot of people and groups have moved away from Yahoo Groups. If you had an active group that you cared about, it was clearly time to find alternate hosting quite some time ago and probably many people did (likely with Google Groups). At another level, I'm a bit surprised that it's this dramatic a shift. I would have expected plenty of people and groups to stick around until the very end, out of either inertia or ignorance. Perhaps Yahoo Groups service got so bad and so unreliable that even people who don't pay attention to computer news noticed that there was some problem.
On the other hand there's another metric, the amount of email from Yahoo Groups that was rejected due to bad destination addresses here (and how many different addresses there are). We almost always seen a small number of such rejections a day, and the evidence suggests that almost all of them are for the same few addresses. There are old, obsolete addresses here that have been rejecting Yahoo Groups email since last June, and Yahoo Groups is still trying to send email to them. Apparently they don't even handle locally generated bounces, never mind bounces that they refuse to accept back. I can't say I'm too surprised.
Given all of this I can't say I regret the slow motion demise of Yahoo Groups. At this point I'm not going to wish it was happening faster, because it's no longer causing us problems (and clearly hasn't been for more than half a year), but it's also clearly still not healthy. It's just that either the spammers abandoned it too or they finally got thrown off. (Perhaps a combination of both.)