Simple availability doesn't capture timing and the amount of warning
Here is a mistake that I've more or less made myself: assuming that a simple availability or 'amount of downtime' number fully captures your availability situation. It doesn't. In real life it matters a lot both when you go down and whether or not you have advance warning. To put it simply, an hour of planned downtime at 6pm is qualitatively different from an hour of unplanned downtime at 6pm (or at 11am on your busiest morning), even if they have exactly the same effect on your overall availability numbers.
(I've sometimes seen availability numbers cited as excluding planned downtimes. That strikes me as disingenuous unless it comes with very careful disclaimers and a bunch of additional information.)
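As a rough illustration of the arithmetic, here is a minimal Python sketch. The Outage record, the weighted_impact() score, and its weights are all hypothetical; they are made up purely to show that two very different outages produce the same availability number, and that excluding planned downtime can make one of them vanish from the figure entirely.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 365 * 24  # ignoring leap years


@dataclass
class Outage:
    hours: float
    planned: bool  # did people get advance warning?
    peak: bool     # did it land in a busy period?


def availability(outages, include_planned=True):
    """Plain uptime fraction over one year."""
    down = sum(o.hours for o in outages if include_planned or not o.planned)
    return 1 - down / HOURS_PER_YEAR


def weighted_impact(outages, unplanned_weight=3.0, peak_weight=5.0):
    """Hypothetical 'pain' score; the weights are invented for illustration."""
    total = 0.0
    for o in outages:
        weight = 1.0
        if not o.planned:
            weight *= unplanned_weight
        if o.peak:
            weight *= peak_weight
        total += o.hours * weight
    return total


quiet_planned = [Outage(hours=1, planned=True, peak=False)]
busy_unplanned = [Outage(hours=1, planned=False, peak=True)]

# Same raw availability number for both outages.
print(availability(quiet_planned) == availability(busy_unplanned))

# Very different scores once timing and warning are taken into account.
print(weighted_impact(quiet_planned), weighted_impact(busy_unplanned))

# Excluding planned downtime reports perfect availability.
print(availability(quiet_planned, include_planned=False))
```

Any real weighting would depend on your own users and workload; the point is only that the single availability figure throws this information away.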
Of course it's better to not have the downtime at all, but if you're going to have it, it's generally quite worthwhile to transform an unplanned downtime into a planned one (often even if the planned downtime is longer). There is a surprising amount of technology that effectively exists to do this conversion; for example, any non-hotswappable form of redundancy.
(If you have some form of redundancy that you can't hotswap and one half of it breaks, leaving you with no redundancy, you're eventually going to have to take things down to restore the redundancy. This shifts the unplanned downtime of losing your only whatever-it-is to the planned downtime of replacing the failed half.)
Sidebar: UPSes in this view
If you have a perfect UPS and no source of alternate or additional power (a redundant power supply, a transfer switch, etc), you're likely converting unplanned power failures into planned UPS battery replacements. In real life UPSes have been known to cause problems and it's usually not that difficult to have power redundancy. Overall a good setup probably simply decreases the chances of unplanned downtimes.
(Our UPSes exist not to prevent unplanned downtimes from power loss but to hopefully prevent unplanned downtimes from ZFS pool corruption due to power loss. This gives me an odd perspective on UPS issues.)
Written on 31 August 2013.