I've changed my thinking about redundant power supplies
Back almost at the start of Wandering Thoughts, I wrote an entry in which I was pretty negative on redundant power supplies. Since I'm busy specifying redundant power supplies for our new generation of fileserver hardware, I think it's about time I admitted something: now that I'm older and somewhat wiser, I'm changing my mind. Redundant power supplies can be quite worth it. In fact I was at least partially wrong back then.
(In my defense, at the time I had very little experience with decent server hardware for reasons that do not fit in the margins of this entry but boil down to 'hardware budget? what's that?'. In retrospect this shows quite vividly in parts of that old entry.)
It's still true that in theory there are plenty of bits of hardware that can break in your server (and the power supplies in our servers have been very reliable). But in practice we've suffered several power supply failures (especially in our backend disk enclosures) and they are probably either the first or second most common cause of hardware failures around here. Apart from the spinning rust of system drives, those other bits of fragile hardware have almost never failed for us.
(Also, an increasing amount of server hardware effectively has some amount of redundancy for the other breakage-prone parts. For example, the whole system (CPUs included) may be passively cooled through multi-fan airflow; if one fan fails, alarms go off but there's enough remaining airflow and cooling that the system doesn't die.)
There's also an important second thing that redundant power supplies enable for crucial servers: they let you deal easily with various sorts of UPS issues (as I noted in that entry). As we both want UPSes and have had UPS problems in the past, this is an important issue for us. We have a solution now but it adds an extra point of failure; redundant power supplies would let us get rid of it.
There is also a pragmatic side to this. In practice hardware with redundant hot swappable power supplies is almost always simply better built in general (power supplies included). Part of our disk enclosure power supply problems likely comes from the fact that the power supplies are generic PC power supplies that have had to power 12 disks on a continuous basis for years. Given our much better experience with server power supplies it seems likely that a better grade of power supply would improve things in general.
(Part of this is probably just that hot-swap server power supplies are less generic and thus more engineered than baseline PC power supplies.)
I'm now all for redundant power supplies in sufficiently important servers. However, I'm still not sure that I'd put redundant power supplies into most of our servers unless I got them essentially for free; many of our servers are not quite that important and for some we already have server-level redundancy.