"It works on my laptop" is a blame game

August 16, 2020

There is an infamous dialog between developers and operations teams where the core of the exchange is the developer saying "it works on my laptop" and then the operations team saying "well, pack up your laptop, it's going into production". Sometimes this is reframed as the developer saying "it works on my laptop, deploy it to production". One of many ways to understand this exchange is as a game of who is to blame for production issues.

When the developer says "well it works on my laptop", they're implicitly saying "you operations people screwed up when deploying it". When the operations people say "well pack up your laptop", they're implicitly saying in return "no we didn't, you screwed it up one way or another; either it didn't work or you didn't prepare it for deployment". The developer is trying to push blame to operations and operations is trying to push blame back.

(This exchange is perpetually darkly funny to system administrators because we often feel that we're taking the fall for what are actually other people's problems, and in this exchange the operations people get to push back.)

But the important thing here is that this is a social problem, just like any blame game. Sometimes this is because higher-ups will punish someone (implicitly or explicitly) for the issue, and sometimes this is because incentives aren't aligned (which can lead to DevOps as a way to deal with the blame problem).

(This isn't the only thing that DevOps can be for.)

Playing the blame game in real life instead of in funny Internet jokes isn't productive; it's a problem. If your organization is having this dialog for real, it has multiple issues and you're probably going to get caught in the fallout.

(I almost wrote 'you have multiple issues', but it's not your problem, it's the organization's. Unless you're very highly placed, you can't fix these organizational problems, because they point to deep cultural issues in how developers and system administrators view each other, interact with each other, and are probably rewarded.)

Realizing this makes the "it works on my laptop" thing a little less funny to me, and a bit sadder and darker than it was before.


Comments on this page:

By Albert at 2020-08-16 06:33:29:

Using reproducible environments (eg containers), so that things run in exactly the same environment both on the developer's laptop and in production, helps a bit.

By Joseph at 2020-08-16 10:59:18:

I’ve been doing SRE/operations work for almost a decade now. I can definitely relate to this.

In the last 5 years this issue has become a lot less contentious, all because of Docker. Docker was the first open source tool that gave developers an easy-to-use API for creating what is effectively a statically compiled binary for software written in any language, on Linux.

It turns out the following is much easier for operations to deploy:

1. Take the software and an entire operating system and shove them into a tarball.
2. Copy the tarball to the Linux server.
3. Extract the tarball on the Linux server.
4. If it is a service, start it up with systemd. If it is a tool, execute it.
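
As a rough illustration of those four steps (a sketch only, not what Docker does internally and not anyone's actual tooling), here is a small Python version. It really does pack and unpack a directory tree with the standard tarfile module, but for the copy and systemd steps it only builds the command lines instead of running them. The function names, the /tmp and /srv/app paths, and the host and unit names are all hypothetical.

```python
import tarfile
from pathlib import Path


def pack(root: Path, tarball: Path) -> None:
    """Step 1: put everything under `root` into a gzip-compressed tarball."""
    with tarfile.open(tarball, "w:gz") as tar:
        # arcname="." stores paths relative to `root`, so the tarball
        # recreates the same layout wherever it is extracted.
        tar.add(root, arcname=".")


def unpack(tarball: Path, dest: Path) -> None:
    """Step 3 (local version): extract the tarball into `dest`."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:gz") as tar:
        tar.extractall(dest)


def remote_steps(tarball: Path, host: str, unit: str) -> list[list[str]]:
    """Steps 2-4 as command lines; nothing here is executed.

    Assumes (hypothetically) that /srv/app already exists on the server
    and that a systemd unit named `unit` points at the extracted files.
    """
    remote_tar = f"/tmp/{tarball.name}"
    return [
        ["scp", str(tarball), f"{host}:{remote_tar}"],               # step 2: copy
        ["ssh", host, "tar", "-xzf", remote_tar, "-C", "/srv/app"],  # step 3: extract
        ["ssh", host, "systemctl", "start", unit],                   # step 4: start
    ]
```

Something like pack(Path("rootfs"), Path("app.tar.gz")) followed by running each command from remote_steps(...) with subprocess.run would be one way to wire this together; the point is only that every step is a plain file operation that operations can reason about.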

Now this has a variety of problems (disk storage, patching security vulnerabilities, etc.), but operations and developers don’t care, because being able to just copy a file to a server and run it is a game changer for operations.

This is why, as an operations person, I am profoundly hostile to Python, Ruby and other languages that don’t give an easy route to statically compiled binaries. Until Docker, their packaging and deployment stories were a nightmare, and god help you if you needed them to work on both Windows and Linux.

By contrast, Go binaries or Java fat jars are an operations dream to deploy.

Docker was a suboptimal tool for this use case. I hear build tools like Bazel handle this much better. But Docker has the mind share and network effects have taken over, so it is what it is.

By Martin Hradil at 2020-08-18 07:53:36:

I believe Docker is the wrong tool for the job, precisely because development and production environments should be different.

Traditionally, having all the development tools in production environments is frowned upon, as is enabling code reload.

Similarly, a production Docker environment is hard or even impossible to debug once something does break, and being able to touch all the moving parts independently is a must for development.

Written on 16 August 2020.


Last modified: Sun Aug 16 00:00:42 2020