DevOps and the blame problem: an outsider's view

September 26, 2011

I'm an outsider on the whole DevOps movement (we don't have anything like a traditional operations or development environment here), but from my outside perspective it looks like DevOps is really an attempt to deal with the blame problem.

The traditional organization's approach is to blame operations when services aren't running (or aren't running well enough) and to blame development when features aren't delivered. When you blame devs for not delivering features, well, your devs deliver features but not necessarily things like stability and performance. Since ops is not stupid, it then does its best to refuse to install or change things from development, or even to let development near what ops will get blamed for.

(Even if ops is either stupid or optimistic, it does not take very many rounds of 'do thing for devs, world explodes, get yelled at' for the negative conditioning to sink in.)

You can tell development that stability, performance, installability and so on are important too. But it doesn't help; when you blamed devs for not delivering features, you told them what their first and foremost priority was and you're going to get exactly what you asked for. Equally, you can tell ops that being responsive to development is important (either directly or by invoking the good of the whole business) but when you blamed them for services dying you told them what their big priority was. This is not surprising in the least; people are very good at not getting yelled at and very motivated to avoid it.

(Some people think that they can fix the ops side of the problem by blaming ops both if services go down and if development updates aren't deployed promptly. This is a great way to lose anyone in ops who's smart enough to realize that they've been given all of the responsibility and none of the power. Or to put it the other way: people who get yelled at all the time quit.)

At its best, DevOps transforms this to 'devops gets blamed when features aren't there and reliable on the site', joining together both things that you actually want. At its worst, DevOps at least gives the sucker with all of the responsibility some power as well.

(See also Ted Dziuba.)

Sidebar: why ops gets the short end of the traditional stick

Developers at least have the chance of exceeding expectations and thus earning praise; they can deliver features really fast or they can deliver really impressive features, work beyond what people expected was reasonably possible. And anyone can be impressed by a well executed feature because it's generally quite visible.

Ops, well, how does ops exceed expectations? Ops are like janitors; we assume that clean buildings and working services are the natural state of things. You don't get points for either. It's just your job. But miss a spot and boy, the blame rolls in.

(Ops gets praise in exactly the situations where people understand that something exceptional is going on, which generally requires a disaster. Unfortunately this breeds a tendency towards 'heroism'.)


Comments on this page:

From 173.164.130.93 at 2011-09-27 14:49:48:

My take is that DevOps Means Don't Be An A-Hole. That's kind of a complement to what you are saying I think - devops is about improving communications.

Phil Hollenback

By trs80 at 2011-09-27 20:44:55:

Now, if only Firefox could embrace devops instead of pushing out a release with new bugs every 6 weeks.

By cks at 2011-09-28 11:40:04:

I disagree that DevOps is just about improving communication, because I think that there's multiple levels of DevOps problems that organizations can have. Since the full argument got long I put it in an entry, DevopsProblemLevels.

From 68.183.236.187 at 2011-09-29 17:13:58:

i'm thinking devops is the gateway concept to realizing ops itself is a scam. ops folks are hired when dev is tired of fiddling with machines. dev decides that's not its core competency. in reality, the machines matter, and if dev can't keep a handle on them then it's on the way to becoming part of a non-tech org. ops should really be a specialist type of dev, not a human proxy for a pager (basically what tedd said).

Scott Dworkis

From 194.203.208.142 at 2011-09-30 12:49:23:

Huh

Dev? Ops?

Where the f**k is QA in all of this????

Surely the ones who are having problems with stability, features and everything else are the ones who simply do not TEST anything?

If some of the effort put into blame was invested in full unit tests and a separate QA team with fully automated and properly executed testing, this whole DevOps nonsense would never have taken hold.

Danny

From 68.183.236.8 at 2011-09-30 18:53:12:

sorry for the double post. if there's a moderator here could you please delete the first one?

@Danny for both ops and qa, i think it comes down to things that are hard to automate and scale, so orgs try to throw manpower into it. there are tools for automating both. i don't know the qa tools that well, but ops tools are not easy. the org i want to work for recognizes hard problems and needs for specialists is a core competency.

Scott Dworkis

By cks at 2011-10-01 01:44:25:

Scott: I've removed your first (duplicated) comment using magic site admin powers.

Written on 26 September 2011.


Last modified: Mon Sep 26 22:58:40 2011