Practical RAID-1 read balancing
If you're doing performance analysis of a RAID-1 setup, one of the interesting questions is 'which drive gets read from when?'
(Measuring how balanced the current IO load is can be useful, but it doesn't tell you how a changed load will affect the balance.)
Since seeks are the expensive thing on modern drives, you want your RAID-1 system to deal in requests, not in individual blocks, and to send a sequence of sequential reads off to the same disk. If your IO isn't strictly sequential, ideally the system keeps track of the last known head position for each disk (influenced by writes as well as by reads), and issues to the disk that is some combination of positioned closest and least loaded.
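To make the idea concrete, here is a minimal sketch of such a read balancer in Python. It is not how any particular RAID-1 implementation works; the Disk class, the function names, and the load_weight knob are all made up for illustration, and request completion (which would decrement the queue count) is left out.

```python
from dataclasses import dataclass

@dataclass
class Disk:
    name: str
    head: int = 0      # last known head position, as a block number
    queued: int = 0    # requests issued and not yet completed

def choose_disk(disks, start, load_weight=1024):
    """Pick a mirror to serve a read request starting at block `start`.

    load_weight is a hypothetical tuning knob: how many blocks of seek
    distance one queued request is considered to be worth.
    """
    # A read that starts exactly where a disk's head was left is
    # sequential, so keep it on that disk.
    for d in disks:
        if d.head == start:
            return d
    # Otherwise score each disk by seek distance plus weighted load.
    return min(disks, key=lambda d: abs(d.head - start) + load_weight * d.queued)

def issue_read(disks, start, length):
    """Send a whole read request to one disk and track its head."""
    d = choose_disk(disks, start)
    d.queued += 1
    d.head = start + length
    return d

def issue_write(disks, start, length):
    """Writes go to every mirror, so they move every head."""
    for d in disks:
        d.queued += 1
        d.head = start + length

disks = [Disk("a"), Disk("b")]
print(issue_read(disks, 1000, 8).name)   # tie on distance and load, first disk wins
print(issue_read(disks, 1008, 8).name)   # sequential, stays on the same disk
print(issue_read(disks, 10, 8).name)     # the other disk is closer and idle
```

The point of the sequential check is that the balancer works on whole requests; a purely per-block distance score would be free to bounce a sequential stream between the disks.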
But all of this is theory. What you really need to do is measure, because you can never be sure just what a RAID-1 system is doing. Sometimes it can really surprise you.
For example, I once dealt with a reasonably fancy hardware RAID-10 controller where which disk a read would go to was statically determined, not based on load or anything. The controller divided the RAID-10 array up into slices (64K at the time); reads from odd slices went to the first disk in a mirror pair and reads from even slices went to the second. In extreme cases, half the array could be completely idle while the other half was melting down. (Presumably that didn't happen too often.)
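A rough model of that static scheme shows how lopsided it can get. This is a sketch under assumptions, not the controller's actual logic: I'm assuming slices are numbered from zero and that offsets are byte offsets within one mirror pair, and mirror_side and side_counts are names invented here.

```python
from collections import Counter

SLICE_SIZE = 64 * 1024   # 64K slices, as described above

def mirror_side(offset, slice_size=SLICE_SIZE):
    """Return 0 (first disk of the pair) or 1 (second disk) for a read.

    Assumption: slices are numbered from zero; odd slices go to the
    first disk and even slices go to the second.
    """
    slice_no = offset // slice_size
    return 0 if slice_no % 2 == 1 else 1

def side_counts(offsets):
    """Count how many reads land on each side of the mirror."""
    return Counter(mirror_side(o) for o in offsets)

# 4K reads marching sequentially through the array: roughly balanced.
sequential = [i * 4096 for i in range(1000)]

# One read at the start of every odd slice (a 128K stride): every
# single read lands on the first disk.
strided = [SLICE_SIZE + i * 2 * SLICE_SIZE for i in range(1000)]

print(side_counts(sequential))
print(side_counts(strided))
```

Any workload whose stride happens to line up with twice the slice size gets no benefit from the mirror at all, no matter how busy the chosen disk is.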
We only found this out by running a test program against an otherwise idle test array and watching the front panel disk lights, as part of trying to sort out how the array distributed IO over the drives in general. Much to our surprise, in one test half the disks went active and the other half didn't. All I can say is thank goodness for front panel lights.
Sidebar: RAID-10 versus RAID-01
Because I always get them confused: RAID-10 is striped mirrors, RAID-0 on top of RAID-1. RAID-01 is the reverse, mirrored stripes, and is much less resilient against failures of two or more disks; RAID-01 dies unless all failed drives are in the same stripe, whereas RAID-10 will survive unless both sides of a single mirror die.
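The difference is easy to check by brute force. The following sketch assumes a particular (hypothetical) disk numbering: in RAID-10, mirror pair i is disks 2i and 2i+1; in RAID-01, the first half of the disks is one stripe set and the second half is the other.

```python
from itertools import combinations

def raid10_survives(failed, n_pairs):
    """Striped mirrors: dead only if both disks of some pair failed."""
    return all(not ({2 * i, 2 * i + 1} <= failed) for i in range(n_pairs))

def raid01_survives(failed, n_pairs):
    """Mirrored stripes: alive only if one stripe set is fully intact,
    ie all of the failed disks are in the same stripe set."""
    stripe_a = set(range(n_pairs))
    stripe_b = set(range(n_pairs, 2 * n_pairs))
    return failed <= stripe_a or failed <= stripe_b

def survival_fraction(survives, n_pairs, n_failed):
    """Fraction of all n_failed-disk failure combinations survived."""
    cases = [set(c) for c in combinations(range(2 * n_pairs), n_failed)]
    return sum(survives(c, n_pairs) for c in cases) / len(cases)

# Six disks, two simultaneous failures.
print(survival_fraction(raid10_survives, 3, 2))
print(survival_fraction(raid01_survives, 3, 2))
```

With six disks there are 15 ways for two disks to fail. RAID-10 survives 12 of them (everything except the three cases where both halves of one mirror die), while RAID-01 survives only the 6 where both failures land in the same stripe set.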
A sad day for SGI: it's now a spammer
I once quite liked SGI and have been following its slow decline with a certain regret. But I did not expect to see this day come; SGI has become a spammer.
Not directly, of course. Large corporations (which SGI still is, sort of) don't spam people directly. Instead they hire specialized places to do this for them, like the well known (one could say 'infamous') Responsys Interact.
SGI might argue that we have previously been an SGI customer so they could spam us with marketing. Well, no, I'm afraid that this excuse no longer flies, especially since we haven't been an SGI customer since before the turn of the century.
And this is a lovely, glowing illustration of why email is now such a hassle. Because I clearly cannot trust a vendor with any long term email address; regardless of why I am getting in touch with them, I need to always, always use a special, just for them email address. And then cancel the address promptly.
Sidebar: the fine details
The marketing spam message had an envelope origin address of
Newsletter@sgi.rsc01.com and came from the machine
at IP address 126.96.36.199, a /24 inside Savvis allocated to
Responsys. I did not bother to read very much of it; my scanning
tools tell me that it includes, among other things, the address
(We don't do 'opt out' things like unsubscribing around here. We just block spam sources. It's both simpler and more reliable.)
To my surprise, it contained URLs pointing to the website images.gyrogroup.com (as well as Responsys's domains rsvp0.net, rsc01.net, and responsys.com, and SGI's own website). The gyrogroup.com domain belongs to 'Gyro International', which appears to be some kind of marketing firm. In a lovely irony, their website proudly proclaims (in all caps text that's in a graphic; how accessible):
Gyro integrated brand communications build long lasting profitable relationships between people and brands.
Well. Not exactly in this case. Although if you leave out the 'profitable' it's certainly true; I find spam quite memorable, and it certainly builds one sort of relationship.
A web spider update: not actually Uptilt's web spider
A while back, I wrote an entry about a bad web spider that at the time appeared to belong to Uptilt Inc. About a week after I published the entry, some of the system administration folks from Uptilt stumbled across it and got in touch with me to look into the whole situation.
In fact they were pretty puzzled about the incident, because (as they put it) Uptilt didn't even do outgoing HTTP, much less have a web crawler; their business is based on email. After I provided some additional specific information to them, they worked out what seems to have happened.
According to them, 188.8.131.52/27 didn't actually currently belong to Uptilt. Hurricane Electric had allocated it to them in November 2005, but when they ramped up operations from it in December they found they were getting a lot of emails from it blocked; upon investigation, they found that the subnet had previously been used by New Horizons, a well-known spammer, since 2004 or so (see eg the SPEWS listing). So Uptilt asked HE for a new clean netblock, and told HE to take back 184.108.40.206/27. However, neither the ARIN WHOIS information nor some of Uptilt's own records got updated at that time.
Once the Uptilt Inc people worked out what was going on, they got in touch with HE to get the WHOIS information corrected (I expect they also made sure all of their internal records got corrected). Unfortunately, the updated WHOIS information is now completely generic, just showing Hurricane Electric's /18 with no delegation information. Also, the Uptilt people were unable to get HE to tell them who the netblock is now assigned to.
There's a lesson in here about making sure that records, even your own records, are up to date. I've certainly seen similar things happen with internal records here. (In fact back in August I wrote about the accuracy problems of non-essential information.)
A surprising effect of RAID-1 resynchronization
Today I got to run into an interesting performance impact of having a RAID-1 mirror resync running on a big partition of a live system.
An important system was having performance problems today, so we were poking around it. When we watched the disk statistics, we noticed that only the first disk was seeing read traffic; the second disk was loafing along with just occasional bursts of writes. Looking more closely we noticed that a RAID-1 resync of a big partition was in progress; because the system was loaded, the resync's IO bandwidth had been choked and it hadn't gotten very far, only 5% or so in a 100G partition.
Then the light dawned. Normally, reads are distributed over both sides of a RAID-1 mirror. However, at the moment only 5% of the second disk was valid; a read for something in the remaining 95% could only be done by the first disk. No wonder the first disk was running hot and the second disk was seeing virtually no reads.
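A back-of-the-envelope model puts numbers on this. It rests on assumptions I haven't verified for our system: that the resync proceeds linearly from the start of the partition, that reads are spread uniformly over the partition, and that reads inside the already-synced region are split evenly between the disks. The function names are made up for the sketch.

```python
def readable_disks(offset, resync_point):
    """Disks that can serve a read at `offset` while disk 1 is being
    resynced from disk 0. Only the region below resync_point is valid
    on disk 1 (assuming a linear resync from offset zero)."""
    return [0, 1] if offset < resync_point else [0]

def expected_read_share(resync_fraction):
    """Expected (first disk, second disk) share of reads, assuming
    uniform reads and an even split inside the synced region."""
    second = resync_fraction / 2
    return 1 - second, second

first, second = expected_read_share(0.05)
print(f"first disk: {first:.1%}, second disk: {second:.1%}")
```

Under those assumptions, a resync that is 5% done leaves the second disk with about 2.5% of the reads and the first disk with the remaining 97.5%, which is consistent with what the disk statistics showed.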
Like everybody, I already knew about the direct IO impact of a RAID-1 resync. But the choking effect of not being able to read from both disks for most of the filesystem hadn't previously occurred to me.
Sidebar: what's a RAID-1 resync?
A RAID-1 resync is what happens when the two disks in a RAID-1 mirror cease to be identical copies of each other, usually due to some calamity (power loss, system crash, disk failure). When this happens, one of the mirrors is identified as the most up to date and its data gets dumped to the other disk to bring them back into sync.
The obvious effect of a RAID-1 resync is that it adds extra IO to the system: reads on the first disk, writes on the second disk. However, any decent RAID system has various things to limit this IO so that it happens more or less when the disks are idle and doesn't steal IO bandwidth from real work.