Wandering Thoughts archives


Why I am harsh on Solaris Live Upgrade and similar tools

In the previous entry I noted that one reason I was basically uninterested in Solaris Live Upgrade is that it had hung up when I tested it several years ago (and quite a few patch levels back). This may strike people as a rather harsh reaction to a bug, to which I am going to say: absolutely, but it's the same reaction I have to bugs in any similar tool, regardless of who it's from or what it runs on.

For Solaris Live Upgrade (LU) to be worth using instead of dangerous, it needs to do a great many things right, things that are both complex and down at the heart of the system. LU must not modify my live boot environment (only the selected alternate), it must reliably boot the boot environment I want it to, it must correctly handle falling back to another boot environment, booting an alternate environment must leave my main one completely untouched, and so on. And it must get these things right all the time and even in obscure cases, because something we're doing may turn out to be one of those obscure cases; a tool like LU cannot afford to be a 90% tool or even a 95% tool. If LU screws up any of this, I have serious problems; at the worst, I have data loss and major system downtime. If LU gets pretty much anything wrong, I am better off not using it at all.

There is only so much of this that I can explicitly test, which means that I have to actively trust LU and the people who wrote it to get all of these things right. What happens next is simple: bugs destroy my trust. A bug is a place where LU and its programmers have not gotten it right. Sure, I might be able to work around the bug and get LU going anyway, but if there is a bug in something that I have tested, how can I have any confidence that there aren't other bugs in things that I either haven't tested yet or can't even test at all?

I can't. And without trust in the system, I can't use it at all, not unless I desperately need it and I'm willing to take a significant risk because I have no feasible alternative.

So yes, absolutely I am harsh. For good reason.

(Solaris Live Upgrade isn't the only thing that I have tried, hit a bug in, and abandoned. For example, I would like to be able to trust Linux LVM's pvmove, but I had it lock up on me once close to half a decade ago and I haven't touched it since. Maybe it's better now; I don't care. It's not worth the risk of actual data loss.)

sysadmin/HarshOnSystemTools written at 02:04:08
